Python cuml.dask.feature_extraction.text.TfidfTransformer: usage and code example


Usage:

class cuml.dask.feature_extraction.text.TfidfTransformer(*, client=None, verbose=False, **kwargs)

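Distributed TF-IDF transformer. It takes a distributed sparse matrix of term counts and reweights it by TF-IDF across multiple GPUs.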
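To make the transformation concrete, here is a minimal pure-Python sketch of TF-IDF on a tiny dense count matrix. It assumes the scikit-learn-style defaults (smoothed IDF, `idf = ln((1 + n) / (1 + df)) + 1`, followed by L2 row normalisation); the helper names `smoothed_idf` and `tfidf` are hypothetical and are not part of cuML, which works on sparse GPU arrays instead.

```python
import math


def smoothed_idf(counts):
    """Smoothed inverse document frequency for each column (term)."""
    n_docs = len(counts)
    n_terms = len(counts[0])
    # Document frequency: how many documents contain each term at least once.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    # Assumed smoothing: add one to numerator and denominator, then add 1.
    return [math.log((1 + n_docs) / (1 + df[j])) + 1.0 for j in range(n_terms)]


def tfidf(counts):
    """Weight raw counts by IDF, then scale each row to unit L2 norm."""
    idf = smoothed_idf(counts)
    result = []
    for row in counts:
        weighted = [count * weight for count, weight in zip(row, idf)]
        norm = math.sqrt(sum(w * w for w in weighted))
        result.append([w / norm if norm > 0 else 0.0 for w in weighted])
    return result


# Two documents, three terms; term 0 appears in both documents.
counts = [[1, 1, 0], [1, 0, 2]]
idf = smoothed_idf(counts)
rows = tfidf(counts)
```

A term that occurs in every document gets the lowest weight, so within the first document the rarer term 1 ends up with a larger value than term 0 even though both have a raw count of 1.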

Example

import cupy as cp
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
from cuml.dask.common import to_sparse_dask_array
from cuml.dask.naive_bayes import MultinomialNB
import dask.array  # explicitly load the dask.array submodule used below
from cuml.dask.feature_extraction.text import TfidfTransformer

# Create a local CUDA cluster
cluster = LocalCUDACluster()
client = Client(cluster)

# Load corpus
twenty_train = fetch_20newsgroups(subset='train',
                        shuffle=True, random_state=42)
cv = CountVectorizer()
xformed = cv.fit_transform(twenty_train.data).astype(cp.float32)
X = to_sparse_dask_array(xformed, client)

y = dask.array.from_array(twenty_train.target, asarray=False,
                    fancy=False).astype(cp.int32)

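# Fit the distributed TF-IDF transformer and transform the count matrix
multi_gpu_transformer = TfidfTransformer()
X_transformed = multi_gpu_transformer.fit_transform(X)
# Resolve the unknown chunk sizes of the transformed Dask array
X_transformed.compute_chunk_sizes()

# Train and score a distributed multinomial naive Bayes classifier
model = MultinomialNB()
model.fit(X_transformed, y)
model.score(X_transformed, y)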

Output:

array(0.93264981)
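One reason TF-IDF distributes well is that document frequencies are additive across row partitions: each worker can count locally and the counts can then be summed. The sketch below illustrates only that idea in plain Python; the helpers `partial_df` and `combine` are hypothetical, and this is not a description of how cuML implements the reduction internally.

```python
def partial_df(rows):
    """Per-partition document frequencies and document count."""
    n_terms = len(rows[0])
    df = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    return df, len(rows)


def combine(partials):
    """Sum per-partition results into global document frequencies."""
    total_df = [sum(column) for column in zip(*(df for df, _ in partials))]
    total_docs = sum(n for _, n in partials)
    return total_df, total_docs


# Three documents split across two partitions (as if on two workers).
partitions = [
    [[1, 1, 0], [1, 0, 2]],
    [[0, 3, 0]],
]
global_df, n_docs = combine([partial_df(p) for p in partitions])

# The same statistics computed on the unpartitioned data, for comparison.
all_rows = [row for part in partitions for row in part]
reference_df, reference_docs = partial_df(all_rows)
```

Because the combined statistics match the single-pass result, every partition can afterwards apply the same global IDF vector to its own rows independently.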


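Note: This article was selected and compiled by 純淨天空 from the original English documentation by rapids.ai for cuml.dask.feature_extraction.text.TfidfTransformer. Unless otherwise stated, copyright in the original code belongs to its original authors. Do not reproduce or copy this translation without permission or authorization.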