Source: www.tnstate.edu


Introduction to Data Science

Posted on 30 Aug 2022
Partial capture of text on file.
            Outline
   Data, Big Data and Challenges
   Data Science
    Introduction
    Why Data Science
   Data Scientists
    What do they do?
   Major/Concentration in Data Science
    What courses to take.
         Data All Around
   Lots of data is being collected and warehoused
    Web data, e-commerce
    Financial transactions, bank/credit transactions
    Online trading and purchasing
    Social Network
        How Much Data Do We Have?
     Google processes 20 PB a day (2008)
     Facebook has 60 TB of daily logs
     eBay has 6.5 PB of user data + 50 TB/day (5/2009)
     1000 Genomes Project: 200 TB
     Cost of 1 TB of disk: $35
     Time to read 1 TB disk: about 3 hrs (at 100 MB/s)
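
The read-time figure on this slide can be checked with simple arithmetic. The sketch below is illustrative only: it assumes decimal (SI) units, which the slide does not specify, and the helper name `read_time_hours` is hypothetical. As a second, rough estimate it also asks how many disks reading in parallel at 100 MB/s would be needed to scan Google's quoted 20 PB within one day, assuming perfectly parallel sequential reads with no overhead.

```python
import math

# Decimal (SI) units; an assumption, since the slide does not say which it uses.
MB = 10**6
TB = 10**12
PB = 10**15

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400


def read_time_hours(size_bytes: int, throughput_bytes_per_s: float) -> float:
    """Hours needed to read size_bytes sequentially at the given throughput."""
    return size_bytes / throughput_bytes_per_s / SECONDS_PER_HOUR


# Time to read one 1 TB disk at 100 MB/s (the slide rounds this to 3 hrs).
one_tb_hours = read_time_hours(1 * TB, 100 * MB)
print(f"1 TB at 100 MB/s: {one_tb_hours:.2f} hours")

# Disks needed to scan 20 PB in one day if each disk reads 100 MB/s in parallel.
bytes_per_disk_per_day = 100 * MB * SECONDS_PER_DAY
disks_needed = math.ceil(20 * PB / bytes_per_disk_per_day)
print(f"Disks needed to read 20 PB in a day: {disks_needed}")
```

Even under these idealized assumptions, a single disk takes hours per terabyte, which is why data at this scale is usually spread across many machines and read in parallel.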
             Big Data
   Big Data is any data that is expensive to manage and hard to extract value from
    Volume
      The size of the data
    Velocity
      The latency of data processing relative to the growing demand for interactivity
    Variety and Complexity
      The diversity of sources, formats, quality, and structures