This article briefly introduces the usage of sklearn.neighbors.BallTree in Python.
Usage:
class sklearn.neighbors.BallTree(X, leaf_size=40, metric='minkowski', **kwargs)
BallTree for fast generalized N-point problems.
Read more in the User Guide.
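Parameters:
- X: array-like of shape (n_samples, n_features)
n_samples is the number of points in the data set, and n_features is the dimension of the parameter space. Note: if X is a C-contiguous array of doubles, the data will not be copied. Otherwise, an internal copy will be made.
- leaf_size: positive int, default=40
Number of points at which to switch to brute force. Changing leaf_size will not affect the results of a query, but can significantly impact the speed of a query and the memory required to store the constructed tree. The amount of memory needed to store the tree scales as approximately n_samples / leaf_size. For a specified leaf_size, a leaf node is guaranteed to satisfy leaf_size <= n_points <= 2 * leaf_size, except in the case that n_samples < leaf_size.
- metric: str or DistanceMetric object
The distance metric to use for the tree. Default='minkowski' with p=2 (that is, a Euclidean metric). See the documentation of the DistanceMetric class for a list of available metrics. ball_tree.valid_metrics gives a list of the metrics which are valid for BallTree.
- **kwargs:
Additional keywords are passed to the distance metric class. Note: callable functions in the metric parameter are NOT supported for KDTree and Ball Tree. Function call overhead will result in very poor performance.
Attributes:
- data: memory view
The training data.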
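As an illustration of two of the parameter notes above, the following minimal sketch checks that changing leaf_size leaves query results unchanged, and that an extra keyword such as p is forwarded to the metric. The data set and variable names are hypothetical and chosen only for this example.

```python
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.RandomState(0)
X = rng.random_sample((200, 3))  # hypothetical toy data: 200 points in 3 dimensions

# Same data, two different leaf sizes: only speed and memory should differ.
small_leaves = BallTree(X, leaf_size=2)
large_leaves = BallTree(X, leaf_size=50)
dist_small, ind_small = small_leaves.query(X[:5], k=4)
dist_large, ind_large = large_leaves.query(X[:5], k=4)
same_neighbors = np.array_equal(ind_small, ind_large)
same_distances = np.allclose(dist_small, dist_large)

# Extra keyword arguments go to the metric: Minkowski with p=1 is the
# Manhattan distance, so these two trees should agree as well.
minkowski_p1 = BallTree(X, metric="minkowski", p=1)
manhattan = BallTree(X, metric="manhattan")
dist_p1, ind_p1 = minkowski_p1.query(X[:5], k=4)
dist_l1, ind_l1 = manhattan.query(X[:5], k=4)
same_metric = np.array_equal(ind_p1, ind_l1) and np.allclose(dist_p1, dist_l1)

print(same_neighbors, same_distances, same_metric)
```

On random data without tied distances, all three flags should come out true. The practical choice of leaf_size is then purely a speed and memory trade-off.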
Examples:
Query for k-nearest neighbors
>>> import numpy as np
>>> from sklearn.neighbors import BallTree
>>> rng = np.random.RandomState(0)
>>> X = rng.random_sample((10, 3))  # 10 points in 3 dimensions
>>> tree = BallTree(X, leaf_size=2)
>>> dist, ind = tree.query(X[:1], k=3)
>>> print(ind)  # indices of 3 closest neighbors
[0 3 1]
>>> print(dist)  # distances to 3 closest neighbors
[ 0. 0.19662693 0.29473397]
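To make clear what the k-nearest-neighbor query returns, here is a small sketch that compares it with a brute-force computation in NumPy. It assumes the default Euclidean metric and data without tied distances. The brute-force variable names are illustrative and not part of the library.

```python
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.RandomState(0)
X = rng.random_sample((10, 3))  # 10 points in 3 dimensions
tree = BallTree(X, leaf_size=2)

# Query two points at once; both returned arrays have shape (2, 3).
dist, ind = tree.query(X[:2], k=3)

# Brute force: all pairwise Euclidean distances, then keep the 3 smallest.
diff = X[:2, None, :] - X[None, :, :]
all_dist = np.sqrt((diff ** 2).sum(axis=-1))
brute_ind = np.argsort(all_dist, axis=1)[:, :3]
brute_dist = np.take_along_axis(all_dist, brute_ind, axis=1)

print(np.array_equal(ind, brute_ind), np.allclose(dist, brute_dist))
```

Each row of dist and ind corresponds to one query point, with neighbors ordered from nearest to farthest.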
Pickle and unpickle a tree. Note that the state of the tree is saved in the pickle operation: the tree does not need to be rebuilt upon unpickling.
>>> import numpy as np
>>> import pickle
>>> from sklearn.neighbors import BallTree
>>> rng = np.random.RandomState(0)
>>> X = rng.random_sample((10, 3))  # 10 points in 3 dimensions
>>> tree = BallTree(X, leaf_size=2)
>>> s = pickle.dumps(tree)
>>> tree_copy = pickle.loads(s)
>>> dist, ind = tree_copy.query(X[:1], k=3)
>>> print(ind)  # indices of 3 closest neighbors
[0 3 1]
>>> print(dist)  # distances to 3 closest neighbors
[ 0. 0.19662693 0.29473397]
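The same round trip can go through a file on disk, which is one way a tree might be reused across sessions. This is only a sketch: the file name is hypothetical, and a temporary directory is used so that nothing is left behind.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.RandomState(0)
X = rng.random_sample((10, 3))
tree = BallTree(X, leaf_size=2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "ball_tree.pkl")  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(tree, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored tree answers queries without being rebuilt from X.
dist_a, ind_a = tree.query(X[:3], k=3)
dist_b, ind_b = restored.query(X[:3], k=3)
print(np.array_equal(ind_a, ind_b), np.allclose(dist_a, dist_b))
```

As with any pickle, only load files from a trusted source.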
Query for neighbors within a given radius
>>> import numpy as np
>>> from sklearn.neighbors import BallTree
>>> rng = np.random.RandomState(0)
>>> X = rng.random_sample((10, 3))  # 10 points in 3 dimensions
>>> tree = BallTree(X, leaf_size=2)
>>> print(tree.query_radius(X[:1], r=0.3, count_only=True))
3
>>> ind = tree.query_radius(X[:1], r=0.3)
>>> print(ind)  # indices of neighbors within distance 0.3
[3 0 1]
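A radius query over several points returns one index array per query point, in no particular order. The sketch below checks this against a brute-force distance computation, assuming the default Euclidean metric. The helper names are illustrative only.

```python
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.RandomState(0)
X = rng.random_sample((10, 3))
tree = BallTree(X, leaf_size=2)
r = 0.3

counts = tree.query_radius(X[:2], r=r, count_only=True)  # one count per query
neighbors = tree.query_radius(X[:2], r=r)  # one index array per query point

# Brute force: indices of all points whose distance to the query is <= r.
diff = X[:2, None, :] - X[None, :, :]
all_dist = np.sqrt((diff ** 2).sum(axis=-1))
matches = [
    np.array_equal(np.sort(neighbors[i]), np.flatnonzero(all_dist[i] <= r))
    for i in range(2)
]
print(matches, [len(n) for n in neighbors], counts)
```

Sorting the returned indices before comparing avoids relying on the order in which the tree visits its nodes.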
Compute a Gaussian kernel density estimate:
>>> import numpy as np
>>> from sklearn.neighbors import BallTree
>>> rng = np.random.RandomState(42)
>>> X = rng.random_sample((100, 3))
>>> tree = BallTree(X)
>>> tree.kernel_density(X[:3], h=0.1, kernel='gaussian')
array([ 6.94114649, 7.83281226, 7.2071716 ])
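To show what kind of quantity kernel_density computes, the sketch below compares it with a direct NumPy sum. It rests on an assumption that should be checked against the installed scikit-learn version: that the result is the sum of normalized Gaussian kernels over all training points, not divided by the number of points. Normalization conventions may differ between releases, so the printed values need not match the output shown above.

```python
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.RandomState(42)
X = rng.random_sample((100, 3))
h = 0.1  # kernel bandwidth
tree = BallTree(X)
tree_density = tree.kernel_density(X[:3], h=h, kernel="gaussian")

# Assumed definition: sum over training points of a Gaussian kernel with
# normalization 1 / ((2 * pi)^(d / 2) * h^d), where d is the number of features.
n_features = X.shape[1]
diff = X[:3, None, :] - X[None, :, :]
sq_dist = (diff ** 2).sum(axis=-1)
norm = 1.0 / ((2 * np.pi) ** (n_features / 2) * h ** n_features)
brute_density = norm * np.exp(-0.5 * sq_dist / h ** 2).sum(axis=1)

# The log-density variant should agree with the log of the plain result.
log_density = tree.kernel_density(X[:3], h=h, kernel="gaussian", return_log=True)

print(np.allclose(tree_density, brute_density, rtol=1e-6))
print(np.allclose(np.log(tree_density), log_density))
```

Dividing the result by the number of training points would give a density that integrates to one, under the same assumption.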
Compute a two-point auto-correlation function
>>> import numpy as np
>>> from sklearn.neighbors import BallTree
>>> rng = np.random.RandomState(0)
>>> X = rng.random_sample((30, 3))
>>> r = np.linspace(0, 1, 5)
>>> tree = BallTree(X)
>>> tree.two_point_correlation(X, r)
array([ 30, 62, 278, 580, 820])
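The counts returned by two_point_correlation can be read as the number of (query point, training point) pairs whose distance is at most each radius, with each point paired with itself included. The following sketch checks that reading against a brute-force count, assuming the default Euclidean metric. The radii here start above zero only to keep the comparison away from exact-zero distances.

```python
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.RandomState(0)
X = rng.random_sample((30, 3))
r = np.linspace(0.2, 1.0, 5)  # illustrative radii
tree = BallTree(X)
counts = tree.two_point_correlation(X, r)

# Brute force: count all ordered pairs (i, j) with distance <= each radius.
diff = X[:, None, :] - X[None, :, :]
all_dist = np.sqrt((diff ** 2).sum(axis=-1))
brute_counts = np.array([(all_dist <= radius).sum() for radius in r])

print(np.array_equal(counts, brute_counts))
```

Because the counts are cumulative in the radius, they never decrease as r grows.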
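Note: this article was selected and compiled by 纯净天空 from the original English work sklearn.neighbors.BallTree by the authors at scikit-learn.org. Unless otherwise stated, the copyright of the original code belongs to the original authors. Please do not reprint or copy this translation without permission or authorization.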