Source: web.njit.edu


File: Chi Square Test Ppt 69376 | Parallel Chisquare

Filetype: PowerPoint (PPTX) | Posted on 29 Aug 2022
        Chi-square test
  • The chi-square test is a popular feature selection 
   method when we have categorical data and 
   classification labels (as opposed to regression targets)
  • In a feature selection context we apply 
   the chi-square test to each feature and rank 
   the features by their chi-square values (or p-values)
  • A parallel solution computes the chi-square values of 
   all features at the same time, rather than 
   one feature at a time as in a serial implementation
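
The ranking step described on this slide can be sketched in a few lines of Python. This is a minimal serial sketch, not the slides' implementation: the function names `chi_square` and `rank_features`, the feature names, and the toy data are all hypothetical, and labels are assumed to be 0 or 1.

```python
from collections import Counter

def chi_square(feature, labels):
    """Chi-square statistic between one categorical feature and the labels."""
    n = len(labels)
    observed = Counter(zip(labels, feature))   # count of each (label, value) cell
    label_totals = Counter(labels)             # row totals
    value_totals = Counter(feature)            # column totals
    stat = 0.0
    for lab, lab_count in label_totals.items():
        for val, val_count in value_totals.items():
            expected = lab_count * val_count / n   # expected count under independence
            stat += (observed[(lab, val)] - expected) ** 2 / expected
    return stat

def rank_features(features, labels):
    """Score every feature, then sort from most to least label-dependent."""
    # Each score depends only on its own feature column, which is why the
    # scores could equally be computed in parallel.
    scores = {name: chi_square(col, labels) for name, col in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: one feature that tracks the label, one that does not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f_unrelated": ["A", "B", "A", "B", "A", "B", "A", "B"],
    "f_related":   ["A", "A", "A", "A", "B", "B", "B", "B"],
}
ranking = rank_features(features, labels)
print(ranking)
```

Sorting by the statistic in descending order puts the feature that tracks the label first; ranking by p-value in ascending order would give the same order when all features have the same number of categories.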
        Chi-square test
   • We have two random variables:
      – Label (L): 0 or 1
      – Feature (F): categorical (values A and B in the table below)
   • Null hypothesis: the two variables are independent of each other (unrelated)
   • Under independence
      – P(L,F) = P(L)P(F)
      – P(L=0) = (c1+c2)/n
      – P(F=A) = (c1+c3)/n
     where n = c1+c2+c3+c4 is the total number of samples
   • Expected values
      – X1 = P(L=0)P(F=A)n = (c1+c2)(c1+c3)/n
   • We can calculate the chi-square statistic for a given feature and its p-value: the probability of observing a statistic at least this large if the feature were independent of the label.
   • Features with very small p-values deviate significantly from the independence assumption and are therefore considered important.

   Contingency table

                Feature=A        Feature=B
   Label=0      Observed=c1      Observed=c2
                Expected=X1      Expected=X2
   Label=1      Observed=c3      Observed=c4
                Expected=X3      Expected=X4

   Chi-square statistic

   $$\chi^2 = \sum_{i=0}^{d-1} \frac{(c_i - x_i)^2}{x_i}$$

   The sum runs over all d cells of the contingency table (d = 4 above); c_i and x_i are the observed and expected counts of a cell, written c1 to c4 and X1 to X4 in the table.
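
As a worked illustration of the 2x2 case on this slide, the sketch below computes the expected counts, the chi-square statistic, and a p-value from the four observed counts. The function name `chi_square_2x2` and the example counts are hypothetical. The p-value line relies on the fact that a 2x2 table has one degree of freedom, for which the chi-square tail probability equals `erfc(sqrt(chi2 / 2))`; it does not generalize to larger tables.

```python
import math

def chi_square_2x2(c1, c2, c3, c4):
    """Chi-square statistic and p-value for the 2x2 table [[c1, c2], [c3, c4]].

    Rows are Label=0 / Label=1, columns are Feature=A / Feature=B.
    Assumes at least one sample (n > 0).
    """
    n = c1 + c2 + c3 + c4
    p_l0, p_l1 = (c1 + c2) / n, (c3 + c4) / n   # P(L=0), P(L=1)
    p_fa, p_fb = (c1 + c3) / n, (c2 + c4) / n   # P(F=A), P(F=B)

    observed = [c1, c2, c3, c4]
    # Expected counts X1..X4 under independence: P(L) * P(F) * n
    expected = [p_l0 * p_fa * n, p_l0 * p_fb * n,
                p_l1 * p_fa * n, p_l1 * p_fb * n]

    # Cells with zero expected count (an empty row or column) are skipped.
    chi2 = sum((c - x) ** 2 / x for c, x in zip(observed, expected) if x > 0)

    # Tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: the feature value mostly follows the label.
chi2, p_value = chi_square_2x2(30, 10, 10, 30)
print(chi2, p_value)
```

For these counts every expected value is 20, so each cell contributes (10)^2 / 20 = 5 and the statistic is 20; the resulting p-value is very small, which is the situation the slide describes as a feature that deviates significantly from independence.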
   Parallel GPU implementation of chi-square test in CUDA
  • The key here is to organize the data to enable 
   coalesced memory access
  • We define a kernel function that computes the 
   chi-square value for a given feature
  • CUDA automatically distributes the 
   kernel's threads across the GPU cores, so many features are processed 
   simultaneously.
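
The one-thread-per-feature idea can be emulated in plain Python to make the indexing concrete. This is not CUDA code and not the slides' kernel: `chi_square_kernel` and `launch` are hypothetical names, and the layout is an assumption. Here the data is stored sample by sample in one flat array, so the value of feature `j` for sample `i` sits at `i * n_features + j`. With that layout, threads `j` and `j + 1` read neighbouring addresses at every step `i`, which is the access pattern GPUs can coalesce.

```python
def chi_square_kernel(j, data, labels, n_samples, n_features, n_values, out):
    """Work of the hypothetical thread that owns feature j.

    Assumes labels are 0 or 1 and feature values are integers in
    range(n_values).
    """
    counts = [[0] * n_values for _ in range(2)]   # 2 labels x n_values cells
    for i in range(n_samples):
        # "Thread" j reads address i * n_features + j; neighbouring threads
        # therefore read neighbouring addresses at the same step i.
        counts[labels[i]][data[i * n_features + j]] += 1

    stat = 0.0
    for lab in range(2):
        row_total = sum(counts[lab])
        for v in range(n_values):
            col_total = counts[0][v] + counts[1][v]
            expected = row_total * col_total / n_samples
            if expected > 0:                      # skip empty rows or columns
                stat += (counts[lab][v] - expected) ** 2 / expected
    out[j] = stat                                 # one result slot per feature

def launch(data, labels, n_samples, n_features, n_values):
    """Serial stand-in for a kernel launch with one thread per feature."""
    out = [0.0] * n_features
    for j in range(n_features):   # on a GPU these iterations would run concurrently
        chi_square_kernel(j, data, labels, n_samples, n_features, n_values, out)
    return out

# Hypothetical data: 4 samples, 2 binary features, stored sample by sample.
labels = [0, 0, 1, 1]
data = [0, 0,   # sample 0: feature 0, feature 1
        0, 1,   # sample 1
        1, 0,   # sample 2
        1, 1]   # sample 3
scores = launch(data, labels, n_samples=4, n_features=2, n_values=2)
print(scores)
```

Because each call writes only to its own slot `out[j]` and reads shared data without modifying it, the per-feature computations do not interact, which is what allows them to run simultaneously on a GPU.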