

Python sklearn.Binarizer(): usage and code examples


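sklearn.preprocessing.Binarizer() is a transformer class in scikit-learn's preprocessing module. It discretizes continuous feature values into two classes, 0 and 1, by comparing each value with a threshold.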

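Example 1:
The pixel values of an 8-bit grayscale image are continuous data ranging from 0 (black) to 255 (white), and the image needs to be pure black and white. With Binarizer() you can set a threshold of 127, so that pixel values 0-127 are converted to 0 and values 128-255 are converted to 1.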
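A minimal sketch of Example 1, assuming a tiny 2x3 "image" stored as a NumPy array. The pixel values are invented for illustration; only the threshold of 127 comes from the example.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical 2x3 grayscale patch (values invented for illustration)
pixels = np.array([[0, 127, 128],
                   [64, 200, 255]])

# Values <= 127 map to 0, values > 127 map to 1
black_white = Binarizer(threshold=127).fit_transform(pixels)
print(black_white)
```

Because the comparison is strict (greater than), a pixel equal to 127 stays on the black side.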

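Example 2:
A machine record has "Success Percentage" as a feature. These values are continuous, ranging from 10% to 99%, but the researcher only needs a pass or fail status for the machine, to be predicted from the other given parameters. Binarizer() can reduce the percentage to that two-valued status.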
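A rough sketch of Example 2. The article does not state a cutoff, so the 50% threshold and the sample percentages below are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical success percentages, one machine per row
success_pct = np.array([[10.0], [49.5], [50.0], [75.0], [99.0]])

# Assumed cutoff: above 50% counts as pass (1), otherwise fail (0)
status = Binarizer(threshold=50.0).fit_transform(success_pct)

# Flatten to a 1-D integer label vector usable as a prediction target
labels = status.ravel().astype(int)
print(labels)
```

Note that a machine at exactly 50% is labelled 0 under this assumed cutoff, since only values strictly above the threshold map to 1.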


Syntax:

sklearn.preprocessing.Binarizer(*, threshold=0.0, copy=True)

Parameters:

threshold: [float, optional] Values less than or equal to threshold are mapped to 0; all other values are mapped to 1. Defaults to 0.0.
copy: [boolean, optional] If set to False, the transformer tries to binarize in place and avoid a copy. Defaults to True.

Returns:

Binarized feature values
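The thresholding rule described under the threshold parameter can be sketched in plain Python. The helper binarize_values is hypothetical and not part of scikit-learn; it only mirrors the documented rule for a flat list of numbers.

```python
def binarize_values(values, threshold=0.0):
    """Return 0 for values <= threshold and 1 for values > threshold."""
    return [1 if v > threshold else 0 for v in values]


# Default threshold of 0.0: only strictly positive values become 1
default_result = binarize_values([-1.5, 0.0, 0.2])

# Custom threshold: a value equal to the threshold maps to 0
custom_result = binarize_values([27, 35, 44], threshold=35)

print(default_result)
print(custom_result)
```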

Download the dataset:
Go to the link and download Data.csv

Below is Python code illustrating sklearn.preprocessing.Binarizer():

# Python code explaining how 
# to Binarize feature values 
   
""" PART 1 
    Importing Libraries """
   
import numpy as np 
import matplotlib.pyplot as plt 
import pandas as pd 
  
# Sklearn library  
from sklearn import preprocessing 
  
""" PART 2 
    Importing Data """
   
data_set = pd.read_csv( 
        'C:\\Users\\dell\\Desktop\\Data_for_Feature_Scaling.csv') 
# print() is needed for the preview to appear when run as a script
print(data_set.head())
  
# here Features - Age and Salary columns  
# are taken using slicing 
# to binarize values 
age = data_set.iloc[:, 1].values 
salary = data_set.iloc[:, 2].values 
print("\nOriginal age data values:\n", age)
print("\nOriginal salary data values:\n", salary)
  
""" PART 3
    Binarizing values """
  
from sklearn.preprocessing import Binarizer 
  
x = age 
x = x.reshape(1, -1) 
y = salary 
y = y.reshape(1, -1) 
  
# For age, let threshold be 35 
# For salary, let threshold be 61000 
# threshold must be passed by keyword in current scikit-learn releases
binarizer_1 = Binarizer(threshold=35)
binarizer_2 = Binarizer(threshold=61000)
  
# Transformed feature 
print("\nBinarized age:\n", binarizer_1.fit_transform(x))

print("\nBinarized salary:\n", binarizer_2.fit_transform(y))

Output:

   Country  Age  Salary  Purchased
0   France   44   72000          0
1    Spain   27   48000          1
2  Germany   30   54000          0
3    Spain   38   61000          0
4  Germany   40    1000          1

Original age data values:
 [44 27 30 38 40 35 78 48 50 37]

Original salary data values:
 [72000 48000 54000 61000  1000 58000 52000 79000 83000 67000]

Binarized age:
 [[1 0 0 1 1 0 1 1 1 1]]

Binarized salary:
 [[1 0 0 0 0 0 0 1 1 1]]
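For readers without the CSV file, the same result can be reproduced from the age and salary values printed above. This sketch also uses reshape(-1, 1), an alternative to the article's reshape(1, -1) that follows the scikit-learn convention of one sample per row; since Binarizer works element by element, the 0/1 values are the same either way.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Age and salary values copied from the printed output
age = np.array([44, 27, 30, 38, 40, 35, 78, 48, 50, 37])
salary = np.array([72000, 48000, 54000, 61000, 1000,
                   58000, 52000, 79000, 83000, 67000])

# reshape(-1, 1) gives one sample per row and a single feature column
age_bin = Binarizer(threshold=35).fit_transform(age.reshape(-1, 1))
salary_bin = Binarizer(threshold=61000).fit_transform(salary.reshape(-1, 1))

# Flatten back to 1-D for compact printing
print(age_bin.ravel())
print(salary_bin.ravel())
```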




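Note: This article was selected and compiled by 纯净天空 from the original English work "sklearn.Binarizer() in Python" by Mohit Gupta_OMG. Unless otherwise stated, copyright of the original code belongs to the original author. Do not reproduce or copy this translation without permission or authorization.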