DATA WRANGLING WITH PYTHON

V Semester: CSE (DS)

Course Code   Category   Hours / Week    Credits   Maximum Marks
                         L    T    P     C         CIA   SEE   Total
ACDC05        Core       3    1    0     4         30    70    100

Contact Classes: 45    Tutorial Classes: 15    Practical Classes: Nil    Total Classes: 60
Prerequisites: Python Programming.
I. COURSE OVERVIEW:
Data wrangling is the process of cleaning and unifying messy and complex data sets for easy access and
analysis. This course describes the importing of data from CSV and PDF files, data clean-up tasks such as
the elimination of bad data, duplicates, and outliers, and data conditioning steps such as normalization and
standardization. The course also discusses exploring data for correlations and associations, and
producing statistical summaries of the given data. Several data visualizations, such as plots, charts, maps, and
tables, are also discussed. Finally, the principles of web scraping, web crawlers, and spiders are presented. The
knowledge and skills gained in this course are prerequisites for full-fledged data analysis.
                      
II. COURSE OBJECTIVES:
The students will try to learn:
  I    The concept and importance of data wrangling using Python.
  II   The data cleaning and formatting techniques using Python.
  III  Working with Excel files, PDFs, and non-relational databases not supported by SQL, using Python.
  IV   The application of techniques suitable for web mining applications.
                            
III. COURSE OUTCOMES:
After successful completion of the course, students should be able to:
  CO 1  Outline the concept of data wrangling, the steps in the data wrangling process, and the Python
        basics necessary for implementing it. (Remember)
  CO 2  Summarize the parsing approaches for Excel and PDF files in order to devise techniques for
        dealing with uncommon file types. (Understand)
  CO 3  Distinguish between MySQL/PostgreSQL and NoSQL for storing and acquiring data to and from
        relational and non-relational databases, respectively. (Analyze)
  CO 4  Explain the operations involved in formatting and cleaning data using Python for subsequent
        data analysis. (Understand)
  CO 5  Make use of Python libraries for identifying outliers and correlations in the data, and for
        visualizing them efficiently. (Apply)
  CO 6  Choose an appropriate method of web scraping and crawling, based on the website model, for
        acquiring and storing data from the World Wide Web within the Python framework. (Apply)
                     
IV. SYLLABUS:
MODULE – I: INTRODUCTION TO DATA WRANGLING (09)
What Is Data Wrangling? Importance of Data Wrangling, How Is Data Wrangling Performed? Tasks of Data
Wrangling, Data Wrangling Tools, Introduction to Python, Python Basics, Data Meant to Be Read by
Machines, CSV Data, JSON Data, XML Data.
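
As a minimal sketch of the machine-readable formats listed in Module I, the following reads the same small record set from CSV, JSON, and XML using only the Python standard library. The sample data, the field names (name, score), and the helper function names are made up for illustration and are not taken from the syllabus or the textbook.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same two records in three formats.
CSV_TEXT = "name,score\nAsha,82\nRavi,74\n"
JSON_TEXT = '[{"name": "Asha", "score": 82}, {"name": "Ravi", "score": 74}]'
XML_TEXT = (
    "<records>"
    "<record><name>Asha</name><score>82</score></record>"
    "<record><name>Ravi</name><score>74</score></record>"
    "</records>"
)


def read_csv(text):
    """Parse CSV text into a list of dicts, converting score to int."""
    rows = csv.DictReader(io.StringIO(text))
    return [{"name": r["name"], "score": int(r["score"])} for r in rows]


def read_json(text):
    """Parse JSON text; the types are already preserved by the format."""
    return json.loads(text)


def read_xml(text):
    """Parse XML text into a list of dicts, converting score to int."""
    root = ET.fromstring(text)
    return [
        {"name": rec.findtext("name"), "score": int(rec.findtext("score"))}
        for rec in root.findall("record")
    ]


csv_rows = read_csv(CSV_TEXT)
json_rows = read_json(JSON_TEXT)
xml_rows = read_xml(XML_TEXT)
```

CSV and XML carry every value as text, so the numeric conversion is explicit there, while JSON keeps the number type on its own.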
                      
MODULE – II: WORKING WITH EXCEL FILES AND PDFS (09)
Installing Python Packages, Parsing Excel Files, Getting Started with Parsing, PDFs and Problem Solving in
Python, Programmatic Approaches to PDF Parsing, Converting PDF to Text, Parsing PDFs Using pdfminer,
Acquiring and Storing Data, Databases: A Brief Introduction - Relational Databases: MySQL and PostgreSQL,
Non-Relational Databases: NoSQL, When to Use a Simple File, Alternative Data Storage.
                      
MODULE – III: DATA CLEANUP (09)
Why Clean Data? Data Cleanup Basics, Identifying Values for Data Cleanup, Formatting Data, Finding
Outliers and Bad Data, Finding Duplicates, Fuzzy Matching, RegEx Matching,
Normalizing and Standardizing the Data, Saving the Data, Determining Suitable Data Cleanup, Scripting the
Cleanup, Testing with New Data.
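
Three of the cleanup steps named in Module III (finding duplicates, normalizing, and standardizing) can be sketched with the standard library alone. This is an illustrative sketch, not the course's prescribed method: the records are invented, "normalizing" is assumed to mean min-max scaling to the range 0 to 1, and "standardizing" is assumed to mean z-scores computed with the population standard deviation.

```python
import statistics

# Hypothetical (id, reading) records; the "b" record appears twice.
records = [("a", 12.0), ("b", 15.0), ("b", 15.0), ("c", 18.0), ("d", 30.0)]


def drop_duplicates(rows):
    """Remove exact duplicate rows while keeping first-seen order."""
    return list(dict.fromkeys(rows))


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Convert values to z-scores (mean 0, population standard deviation 1)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


unique_records = drop_duplicates(records)
readings = [value for _, value in unique_records]
normalized = min_max_normalize(readings)
standardized = standardize(readings)
```

Min-max scaling is sensitive to extreme values such as the 30.0 reading here, which is one reason outlier detection usually comes before normalization.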
         
MODULE – IV: DATA EXPLORATION AND ANALYSIS (09)
Exploring Data, Importing Data, Exploring Table Functions, Joining Numerous Datasets, Identifying
Correlations, Identifying Outliers, Creating Groupings, Analyzing Data - Separating and Focusing the Data,
Presenting Data, Visualizing the Data, Charts, Time-Related Data, Maps, Interactives, Words, Images, Video,
and Illustrations, Presentation Tools, Publishing the Data - Open-Source Platforms.
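
For the "Identifying Correlations" and "Creating Groupings" topics in Module IV, a small sketch may help. It computes a Pearson correlation coefficient by hand and averages a column per group. The data set, the column meanings (group, hours studied, marks), and the function names are hypothetical, and Pearson correlation is assumed here as one common choice of correlation measure, not necessarily the one the course uses.

```python
import math
from collections import defaultdict

# Hypothetical observations: (group, hours_studied, marks).
rows = [
    ("A", 2.0, 50.0),
    ("A", 4.0, 60.0),
    ("B", 6.0, 70.0),
    ("B", 8.0, 80.0),
]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Assumes both sequences have non-zero variance; otherwise the
    denominator would be zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def mean_marks_by_group(data):
    """Group rows by their first field and average the marks in each group."""
    groups = defaultdict(list)
    for group, _hours, marks in data:
        groups[group].append(marks)
    return {group: sum(vals) / len(vals) for group, vals in groups.items()}


hours = [r[1] for r in rows]
marks = [r[2] for r in rows]
r_value = pearson(hours, marks)
group_means = mean_marks_by_group(rows)
```

Because the sample marks rise by a fixed amount for every extra two hours, the coefficient comes out at (numerically close to) 1; real data would rarely be this clean.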
         
MODULE – V: WEB SCRAPING (09)
What to Scrape and How, Analyzing a Web Page, Network/Timeline, Interacting with JavaScript, In-Depth
Analysis of a Page, Getting Pages, Reading a Web Page - Reading a Web Page with LXML and XPath,
Advanced Web Scraping - Browser-Based Parsing, Screen Reading with Selenium, Screen Reading with
Ghost.Py, Spidering the Web - Building a Spider with Scrapy, Crawling Whole Websites with Scrapy.
V. TEXTBOOKS:
1.  Jacqueline Kazil & Katharine Jarmul, “Data Wrangling with Python”, O’Reilly Media Inc., 2016.
            
VI. REFERENCE BOOKS:
1.  Dr. Tirthajyoti Sarkar, Shubhadeep, “Data Wrangling with Python: Creating actionable data from raw
    sources”, Packt Publishing Ltd., 2019.
2.  Stefanie Molin, “Hands-On Data Analysis with Pandas”, Packt Publishing Ltd., 2019.
3.  Allan Visochek, “Practical Data Wrangling”, Packt Publishing Ltd., 2017.
4.  Tye Rattenbury, Joseph M. Hellerstein, Jeffrey Heer, Sean Kandel, Connor Carreras, “Principles of Data
    Wrangling: Practical Techniques for Data Preparation”, O’Reilly Media Inc., 2017.
         
VII. WEB REFERENCES:
1.  http://www.gbv.de/dms/ilmenau/toc/827365454.PDF
2.  https://www.udemy.com/course/data-wrangling-with-python/
3.  http://www.openculture.com/free-online-data-science-courses
4.  https://www.classcentral.com/course/dataanalysiswithpython-11177
                                 