Improving data quality in data analytics & machine learning
Release date: 1/2025
Publisher: Udemy
Publisher's website:
https://www.udemy.com/course/dataqc_x/
Author: Mike X Cohen
Duration: 5h 22m 24s
Type of material: Video tutorial
Language: English
Subtitles: English
Description:
What you'll learn
- Strategies for increasing data quality
- Ways to assess data quality
- Interpreting data visualizations
- How to spot problems in data
Requirements
- Interest in working with data
- Interest in knowing more about data quality
- Some Python skills are useful for the optional coding videos
Description
All of our decisions are based on data. Our sense organs gather data, our memories are data, and our gut instincts are data. If you want to make good decisions, you need to have high-quality data.
This course is about data quality: What it means, why it's important, and how you can increase the quality of your data.
In this course, you will learn:
- High-level strategies for ensuring high data quality, including terminology, data documentation and management, and the different research phases in which you can check and increase data quality.
- Qualitative and quantitative methods for evaluating data quality, including visual inspection, error rates, and outliers. Python code is provided to show how to implement these visualizations and scoring methods using pandas, numpy, seaborn, and matplotlib.
- Specific data methods and algorithms for cleaning data and rejecting bad or unusual data. As above, Python code is provided to show how to implement these procedures using pandas, numpy, seaborn, and matplotlib.
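As a rough illustration of the quantitative checks and cleaning steps listed above (this is not taken from the course materials), here is a minimal sketch using pandas and numpy. The column name `reaction_ms`, the toy values, and the 3.5 cutoff on a median/MAD-based robust z-score are all assumptions chosen for the example; the course may use different methods or thresholds.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one implausible value (2900) and one missing entry.
df = pd.DataFrame(
    {"reaction_ms": [310.0, 295.0, 305.0, 2900.0, 300.0, np.nan, 315.0, 290.0]}
)
col = df["reaction_ms"]

# Assessment 1: fraction of entries that are missing.
missing_rate = col.isna().mean()

# Assessment 2: robust z-score based on the median and the median
# absolute deviation (MAD). The 0.6745 factor makes the score comparable
# to an ordinary z-score for normally distributed data.
median = col.median()
mad = (col - median).abs().median()
robust_z = 0.6745 * (col - median) / mad

# Assumed cutoff: flag values whose |robust z| exceeds 3.5.
# Missing entries compare as False, so they are not flagged here.
is_outlier = robust_z.abs() > 3.5

# Cleaning: treat flagged values as missing, then fill every missing
# entry with the median of the remaining values (one simple strategy).
cleaned = col.mask(is_outlier)
cleaned_filled = cleaned.fillna(cleaned.median())

print(f"missing rate: {missing_rate:.3f}")
print(f"outliers flagged: {int(is_outlier.sum())}")
print(cleaned_filled.tolist())
```

Median imputation is used here only because it is short to write. Whether to impute, drop, or keep flagged values depends on the data set and the analysis goal.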
This course is for
- Data practitioners who want to understand both the high-level strategies and the low-level procedures for evaluating and improving data quality.
- Managers, clients, and collaborators who want to understand the importance of data quality, even if they are not working directly with data.
Who this course is for:
- Data science practitioners
- Data science students
- Managers or colleagues who work with data practitioners
Video format: MP4
Video: avc, 1280x720, 16:9, 30.000 fps, 361 kb/s
Audio: aac lc sbr, 44.1 kHz, 62.8 kb/s, 2 channels
Changes
The 2025/1 version is 1 minute longer than the 2022/9 version. The video resolution has also been reduced from 1080p to 720p.
MediaInfo
General
Complete name : D:\2\Udemy - Improving data quality in data analytics & machine learning (1.2025)\07. Outliers and missing data\5. Dealing with missing data.mp4
Format : MPEG-4
Format profile : Base Media
Codec ID : isom (isom/iso2/avc1/mp41)
File size : 19.8 MiB
Duration : 6 min 26 s
Overall bit rate : 430 kb/s
Frame rate : 30.000 FPS
Recorded date : 2025-02-01 16:07:16.5907049+03:30
Writing application : Lavf61.9.100
Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Main@L3.1
Format settings : CABAC / 4 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 4 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 6 min 26 s
Bit rate : 361 kb/s
Nominal bit rate : 1 600 kb/s
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 30.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.013
Stream size : 16.6 MiB (84%)
Writing library : x264 core 148
Encoding settings : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x1:0x111 / me=umh / subme=6 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=0 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=22 / lookahead_threads=3 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=60 / keyint_min=6 / scenecut=0 / intra_refresh=0 / rc_lookahead=60 / rc=cbr / mbtree=1 / bitrate=1600 / ratetol=1.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / vbv_maxrate=1600 / vbv_bufsize=3200 / nal_hrd=none / filler=0 / ip_ratio=1.40 / aq=1:1.00
Codec configuration box : avcC
Audio
ID : 2
Format : AAC LC SBR
Format/Info : Advanced Audio Codec Low Complexity with Spectral Band Replication
Commercial name : HE-AAC
Format settings : Explicit
Codec ID : mp4a-40-2
Duration : 6 min 26 s
Bit rate mode : Constant
Bit rate : 62.8 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 44.1 kHz
Frame rate : 21.533 FPS (2048 SPF)
Compression mode : Lossy
Stream size : 2.89 MiB (15%)
Title : English
Language : English
Default : Yes
Alternate group : 1