Python cuml.dask.feature_extraction.text.TfidfTransformer usage and code examples


Usage:

class cuml.dask.feature_extraction.text.TfidfTransformer(*, client=None, verbose=False, **kwargs)

Distributed TF-IDF transformer
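To make the transformation itself concrete, here is a minimal single-process sketch of what a TF-IDF transform computes on a small dense count matrix. It assumes the scikit-learn-style smoothed IDF, ln((1 + n) / (1 + df)) + 1, followed by L2 row normalization; this page does not state which defaults the cuML class uses, so treat the formula as an assumption. The helper name tfidf_dense_sketch is hypothetical and is not part of cuML.

```python
import numpy as np


def tfidf_dense_sketch(counts):
    """Illustrative dense TF-IDF (hypothetical helper, not the cuML API).

    Assumes smoothed IDF and L2 row normalization.
    """
    counts = np.asarray(counts, dtype=np.float64)
    n_docs = counts.shape[0]

    # Document frequency: how many documents contain each term.
    df = (counts > 0).sum(axis=0)

    # Smoothed IDF: behaves as if one extra document contained every term,
    # which avoids division by zero for terms that never occur.
    idf = np.log((1.0 + n_docs) / (1.0 + df)) + 1.0

    # Scale raw term counts by the per-term IDF weight.
    weighted = counts * idf

    # L2-normalize each row; all-zero rows are left as zeros.
    norms = np.linalg.norm(weighted, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0
    return weighted / norms


# Two documents, three terms; term 0 occurs in both documents.
tfidf = tfidf_dense_sketch([[1, 1, 0],
                            [1, 0, 2]])
print(tfidf)
```

In the first document, the term shared by both documents ends up with a lower weight than the term unique to that document, even though both have a raw count of 1. The distributed class applies the same kind of weighting to a sparse Dask array spread across GPUs.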

Example

import cupy as cp
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
from cuml.dask.common import to_sparse_dask_array
from cuml.dask.naive_bayes import MultinomialNB
import dask
from cuml.dask.feature_extraction.text import TfidfTransformer

# Create a local CUDA cluster
cluster = LocalCUDACluster()
client = Client(cluster)

# Load corpus
twenty_train = fetch_20newsgroups(subset='train',
                        shuffle=True, random_state=42)
cv = CountVectorizer()
xformed = cv.fit_transform(twenty_train.data).astype(cp.float32)
X = to_sparse_dask_array(xformed, client)

y = dask.array.from_array(twenty_train.target, asarray=False,
                    fancy=False).astype(cp.int32)

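# Fit the distributed TF-IDF transformer and transform the count matrix
multi_gpu_transformer = TfidfTransformer()
X_transformed = multi_gpu_transformer.fit_transform(X)
# Resolve the unknown chunk sizes of the resulting Dask array
X_transformed.compute_chunk_sizes()

# Train a multinomial naive Bayes classifier on the TF-IDF features
model = MultinomialNB()
model.fit(X_transformed, y)
model.score(X_transformed, y)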

Output:

array(0.93264981)

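Note: This article was selected and compiled by 纯净天空 from the original English work cuml.dask.feature_extraction.text.TfidfTransformer by rapids.ai. Unless otherwise stated, copyright in the original code belongs to the original authors; do not reproduce or copy this translation without permission or authorization.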