What is the pandas library in Python?

Pandas is a data analysis library in Python that allows users to easily read and manipulate different types of data.

How can I install pandas?

You can install pandas using the command 'pip install pandas' in your terminal.
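One way to confirm that the installation worked is to ask Python whether it can locate the package. This is only a minimal sketch: the helper name `is_installed` is an illustrative assumption and is not part of pandas or pip.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the named top-level package can be located for import."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Reports whether pandas is importable in the current environment.
    print(is_installed("pandas"))
```

If this prints False after running pip, the usual cause is that pip installed into a different environment than the one running the script.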

Why use Jupyter notebooks for working with pandas?

Jupyter notebooks allow visualization of data directly in the browser, which helps in better understanding and analysis of the data.

What type of files can pandas read?

Pandas can read various file types, including CSV and Excel files.
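To illustrate why pandas is convenient for CSV data, the sketch below reads the same text twice: once with Python's built-in `csv` module and once with `pd.read_csv`. The three-row sample and its column names are hypothetical stand-ins for a real file on disk.

```python
import csv
import io

import pandas as pd

# Hypothetical three-row sample standing in for a real CSV file on disk.
SAMPLE_CSV = "Respondent,Hobbyist\n1,Yes\n2,No\n3,Yes\n"

# Standard library: create a reader and collect the rows yourself.
# Every value comes back as a string.
with io.StringIO(SAMPLE_CSV) as handle:
    rows = list(csv.DictReader(handle))

# pandas: a single call returns a DataFrame and infers a dtype per column.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(rows[0])
print(df.dtypes)
```

Excel files are read in a similar way with `pd.read_excel`, which needs an extra engine package such as openpyxl for .xlsx files.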

What is the first step to start using pandas?

The first step is to install pandas and import it in your Python script using 'import pandas as pd'.

How can you view all columns of a data frame in Jupyter notebooks?

You can call pd.set_option('display.max_columns', 85) so that all 85 columns of this dataset are displayed instead of the default 20.
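As a small sketch of the display settings, the snippet below raises both the column and row limits and then reads them back. The value 85 is taken from the video's dataset and is an assumption for any other data.

```python
import pandas as pd

# 85 matches the column count of the survey data used in the video;
# choose a value that suits your own dataset.
pd.set_option("display.max_columns", 85)
pd.set_option("display.max_rows", 85)

# get_option reads a setting back, which is a simple way to check it took effect.
print(pd.get_option("display.max_columns"))
print(pd.get_option("display.max_rows"))
```

Calling `pd.reset_option("display.max_columns")` restores the default if the wider output becomes unwieldy.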

What dataset is used in this video series for analysis?

The dataset used is the Stack Overflow Developer Survey data.

How can you see the shape of a data frame in pandas?

You can use the 'shape' attribute, as in 'df.shape', which returns the number of rows and columns as a tuple. It is an attribute rather than a method, so it takes no parentheses.
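A minimal sketch of reading the shape, using a small hypothetical DataFrame in place of the survey data:

```python
import pandas as pd

# Hypothetical stand-in for the survey DataFrame.
df = pd.DataFrame({"Respondent": [1, 2, 3], "Hobbyist": ["Yes", "No", "Yes"]})

# shape is an attribute, so there are no parentheses.
n_rows, n_cols = df.shape
print(n_rows, n_cols)
```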

What method can be used to view the first few rows of a data frame?

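You can use the 'head()' method, which returns the first five rows by default, or pass a number such as 'df.head(10)' for more. The 'tail()' method does the same for the last rows.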
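The following sketch shows `head` and `tail` on a hypothetical 12-row DataFrame that stands in for the survey data:

```python
import pandas as pd

# Hypothetical 12-row frame standing in for the survey data.
df = pd.DataFrame({"Respondent": range(1, 13)})

first_five = df.head()    # default is 5 rows
first_ten = df.head(10)   # index labels 0 through 9
last_ten = df.tail(10)    # the final 10 rows

print(first_five)
print(last_ten)
```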

How can I learn more about using Jupyter notebooks?

The video's creator has a separate, more detailed tutorial on Jupyter notebooks and links to it in the video description.

Python Pandas Tutorial (Part 1): Getting Started with Data Analysis - Installation and Loading Data

00:23:01

https://www.youtube.com/watch?v=ZyhVh-qRZPA

Summary

TL;DR: This video is the first in a series on using the pandas library in Python for data analysis. Pandas makes it easy to read and analyze data from file types such as CSV and Excel. The tutorial starts by installing pandas and Jupyter, a tool for interactive analysis, optionally inside a virtual environment. It then sets up a working environment: creating a project folder, downloading real-world data from the Stack Overflow Developer Survey, starting a Jupyter server, and loading the data in a notebook. It shows how to inspect the structure of the data and how to adjust display settings so that all columns and rows can be viewed. The video closes with basic methods for previewing data, laying the foundation for the data manipulation covered in later videos.

Takeaways

📚 Pandas is essential for data analysis in Python, especially for data science.
🔧 Installation of pandas can be done using pip in a virtual environment or directly in your system.
📊 Jupyter notebooks are a convenient tool for data visualization when using pandas.
📂 Pandas can easily handle CSV and Excel files for data analysis.
🔍 Understanding and setting up the data viewing options in Jupyter can help manage large datasets effectively.
🗃️ The tutorial uses real-world data from Stack Overflow Developer Survey for demonstration.
👨‍💻 Importing data into pandas is straightforward with 'pd.read_csv()'.
🔍 Viewing dataframes in Jupyter offers interactive inspection through features like 'df.head()' and 'df.info()'.
🛠️ Adjusting settings in Jupyter allows full inspection of dataframe columns and rows.
📈 The video sets the stage for more advanced data manipulation techniques in pandas.
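As a sketch of the `df.info()` inspection mentioned in the takeaways, the snippet below runs it on a small hypothetical DataFrame. In a notebook the method prints directly; here the optional `buf` argument captures the text so it can be reused.

```python
import io

import pandas as pd

# Hypothetical stand-in for the survey DataFrame.
df = pd.DataFrame({"Respondent": [1, 2, 3], "Hobbyist": ["Yes", "No", "Yes"]})

# info is a method, so it needs parentheses. Passing buf redirects the report
# (row count, column names, data types) into a string buffer.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```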

Timeline

00:00:00 - 00:05:00
In this video series, we will be learning how to use the pandas library in Python, a crucial tool for data analysis. Pandas is highly popular due to its abilities to easily read and work with data in formats like CSV and Excel, and to perform data analysis efficiently. We will start by installing pandas, downloading the relevant dataset, and setting up a Jupyter notebook environment for coding and analysis. The video also mentions a sponsor, brilliant.org, urging viewers to check them out.
00:05:00 - 00:10:00
To begin setting up, we install pandas and Jupyter using pip commands. The speaker opts to use Jupyter notebooks despite some initial hesitation. The notebooks provide an advantageous interface to visualize data within the browser, which is useful when working with pandas. A project folder is created on the desktop, and the Stack Overflow developer survey data is downloaded and saved within it. This real-world data is chosen for its relatability and ability to maintain interest in the tutorial scenarios.
00:10:00 - 00:15:00
Files are organized within the project directory, and important files are identified such as the survey results CSV and its schema. The setup process is continued by navigating to the project directory via the terminal and initiating a Jupyter server, which runs locally in a browser. A new Jupyter notebook is created, and pandas is imported to begin data manipulation. The basics of loading and inspecting data with pandas are demonstrated, focusing on CSV file reading and initial data exploration.
00:15:00 - 00:23:01
Key methods for data exploration in pandas are showcased. The video demonstrates how to load data into a DataFrame and explores its contents using `df.shape` and `df.info()` to examine the data's structure and types. Display settings are adjusted so that all columns, and all rows of the smaller schema file, are viewable. The video introduces `df.head()` and `df.tail()` for previewing the first and last rows. There's a brief mention of the sponsor, highlighting courses that supplement learning.
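The video also loads a schema file that maps each column name to its survey question. The sketch below shows one possible way to look up a single question; the two-row excerpt, the header names `Column` and `QuestionText`, and the use of `index_col` with `.loc` are assumptions for illustration, since the video defers direct value access to a later part.

```python
import io

import pandas as pd

# Hypothetical two-row excerpt; the header names are assumed.
SCHEMA_CSV = (
    "Column,QuestionText\n"
    "Hobbyist,Do you code as a hobby?\n"
    "ITperson,Are you the IT support person for your family?\n"
)

# Use the column-name field as the index so rows can be looked up by name.
schema_df = pd.read_csv(io.StringIO(SCHEMA_CSV), index_col="Column")

# .loc selects by index label and column name, returning the full string
# rather than the shortened text shown in the table display.
question = schema_df.loc["Hobbyist", "QuestionText"]
print(question)
```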


Transcript

00:00:00
hey there how's it going everybody in
00:00:01
this series of videos we're going to be
00:00:03
learning how to use the pandas library
00:00:04
in Python so pandas is a data analysis
00:00:07
library that allows us to easily read in
00:00:09
and work with different types of data so
00:00:12
we can use this to analyze CSV files
00:00:14
Excel files and other similar formats so
00:00:17
if you're getting into the data science
00:00:18
field then this library is going to be
00:00:20
essential to learn it's one of the most
00:00:22
downloaded packages for Python and
00:00:24
that's for a great reason so not only
00:00:26
does it allow us to easily read in and
00:00:28
analyze data but it also has great
00:00:30
performance since it's built on top of
00:00:32
NumPy and we'll be learning how to do
00:00:34
different types of
00:00:36
data analysis in this series so in this
00:00:38
video we're going to be going over how
00:00:40
to get pandas installed how to download
00:00:42
the data that I'll be using for most of
00:00:44
this series and also how to get all of
00:00:47
this open in a Jupyter notebook so that
00:00:49
we're ready to do some coding and
00:00:50
analysis now I'd also like to mention
00:00:52
that we do have a sponsor for the series
00:00:54
of videos and that is brilliant.org so I
00:00:57
really want to thank brilliant for
00:00:58
sponsoring this series and it would be
00:01:00
great if you all can check them out
00:01:01
using the link in the description
00:01:02
section below and support the sponsors
00:01:04
and I'll talk more about their services
00:01:06
in just a bit so with that said let's go
00:01:08
ahead and get started so first of all
00:01:10
let's install pandas so I'm using a
00:01:13
clean virtual environment for this
00:01:14
series but you don't have to use a
00:01:16
virtual environment if you don't want to
00:01:17
if you don't know what a virtual
00:01:19
environment is and would like to learn
00:01:21
more about those then I'll be sure to
00:01:23
leave a link to my video on that topic
00:01:25
in the description section below if
00:01:27
anyone is interested so it's really easy
00:01:30
to install pandas here all we need to do
00:01:32
is say pip install pandas and we will
00:01:37
let this run through and once we have
00:01:40
pandas installed then let's also install
00:01:43
Jupyter so that we can use Jupyter
00:01:45
notebooks now I was a bit hesitant to
00:01:48
use Jupyter for this series because some
00:01:50
people find it difficult to get the hang
00:01:52
of but honestly if you're going to be
00:01:54
doing a lot of work with pandas then
00:01:56
it's definitely a nice tool to use for
00:01:58
this so now it's not necessary so you
00:02:01
should be able to follow along with this
00:02:02
series just fine if you're using a
00:02:04
regular editor but Jupyter notebooks
00:02:06
allows us to actually see our data more
00:02:09
easily by using the browser to print out
00:02:11
our data in tables that make it
00:02:13
easier to visualize so I'm gonna use it in
00:02:16
the series but you don't have to in
00:02:18
order to follow along so to install
00:02:20
Jupyter I want to say pip install and
00:02:24
this is going to be jupyterlab and this
00:02:28
is spelled j-u-p-y-t-e-r-l-a-b jupyterlab so
00:02:34
we'll get that installed now I'm not
00:02:36
going to go into a deep dive on how to
00:02:38
use Jupyter in this series I'm mainly
00:02:40
going to focus on pandas but if you'd
00:02:42
like a detailed overview of how to use
00:02:44
Jupyter then I do have a video on how to
00:02:46
use Jupyter in depth and I'll leave a
00:02:48
link to that video in the description
00:02:49
section below if anyone would like to
00:02:52
learn more about the details of using
00:02:54
that ok so now we have pandas and
00:02:56
Jupyter notebooks installed now we're
00:02:58
going to need to download the data that
00:03:00
I'll be using for most of this series
00:03:02
now for anyone who's been watching my
00:03:04
latest videos you know that I like to
00:03:06
use the Stack Overflow Developer Survey
00:03:08
for different kinds of data analysis now
00:03:10
the reason that I like to use this data
00:03:12
is because it's real world data and it
00:03:15
has a lot of data in there that I think
00:03:16
would be interesting to most people who
00:03:18
are watching these types of videos I've
00:03:20
seen some other tutorials where the data
00:03:22
just seems kind of unrealistic and not
00:03:24
very relatable
00:03:26
so hopefully using this data will keep
00:03:28
people interested and also give you a
00:03:30
good idea of what it's like to actually
00:03:32
download real data from a
00:03:35
source and start analyzing it with
00:03:37
pandas so to download this data I have
00:03:40
this pulled up here in the browser we
00:03:42
can go over to the Stack Overflow survey
00:03:45
results page now this is easy to find if
00:03:47
you just google it but just to keep
00:03:49
things easy I'll have a link to this
00:03:51
download page in the description section
00:03:53
as well ok now on this page you can
00:03:57
download the data in CSV form for any
00:04:00
year that they have available and now
00:04:02
I'm going to go ahead and download the
00:04:04
2019 data which is the top data here so
00:04:08
I'm going to download this CSV here and
00:04:12
then we'll click on download again and
00:04:15
this should go ahead and download this
00:04:18
for us ok it did and now I'm going to
00:04:22
open this in my Finder here and I'm
00:04:25
going to unzip this data it comes
00:04:27
zipped and once that data is
00:04:29
downloaded and unzipped I'm going to go
00:04:32
ahead and drag that folder to a folder
00:04:34
here on my desktop and that's where
00:04:37
we'll also create a notebook and analyze
00:04:39
this data so real quick I don't have
00:04:42
this open let me open up this pandas
00:04:48
demo folder and this will open this in
00:04:51
Finder and now I will take the data
00:04:54
and drag this into this pandas demo
00:04:56
folder that is on my desktop so your
00:04:59
projects can be anywhere but I just had
00:05:02
I just created a project folder here on
00:05:05
my desktop called pandas demo and it's
00:05:07
completely empty except for the data
00:05:09
that we just dragged in here so now I'm
00:05:12
going to rename this since this is kind
00:05:14
of a long name here I'm just going to
00:05:16
rename this to data that was named
00:05:19
developer survey 2019 but I'm just gonna
00:05:21
call that data so that it's easy for us
00:05:23
to find that within our script okay so
00:05:26
what files do we have here in the
00:05:28
directory that we unzipped in this data
00:05:30
directory let me make this a little
00:05:32
larger here okay so first of all if you
00:05:36
download data that comes with a readme
00:05:39
then this is usually helpful we have a
00:05:41
readme file right here it tells you what
00:05:43
these other files are going to be so in
00:05:46
this case we have this survey results
00:05:48
public dot CSV and that contains the
00:05:51
main survey results one respondent per
00:05:54
row and one column per answer and the
00:05:57
survey results schema here has the
00:06:00
questions that correspond to each column
00:06:02
name and the results now if any of this
00:06:05
doesn't make sense now then it will
00:06:07
once we open up this data in Jupyter so
00:06:10
I'm just giving a broad overview here
00:06:12
don't let this overwhelm you by
00:06:15
everything that I'm saying here this
00:06:17
will make a lot more sense once we open
00:06:18
this up in Jupyter so let's go ahead and
00:06:21
do that so to open this in a Jupyter
00:06:23
notebook I'm going to go back to my
00:06:26
terminal so I'm going to go ahead and
00:06:27
close these Finder windows open here go
00:06:30
back to my terminal and now within here
00:06:33
I'm going to navigate to my folder where
00:06:35
I place that data and this should be the
00:06:38
same command on Mac
00:06:39
and Windows so I'm gonna say cd and I'm
00:06:43
gonna go to my desktop this is going to
00:06:45
be wherever your project directory is
00:06:47
but mine is in this pandas demo on my
00:06:50
desktop and once I am navigated to that
00:06:53
directory to start up a Jupyter notebook
00:06:55
we just need to say jupyter notebook and
00:06:59
run that and we should see a server
00:07:02
start up here
00:07:03
and it seems like it's taking a second
00:07:05
ok there we go
00:07:06
now back in our terminal here this will
00:07:10
run a Jupyter server and you will need
00:07:13
to leave that terminal open while you're
00:07:15
working in Jupyter so Jupyter runs
00:07:18
in the browser so if you shut down this
00:07:20
server then you won't be able to access
00:07:22
our notebook okay so let's go back here
00:07:27
to the browser and this is where we have
00:07:30
our Jupyter notebooks so let me zoom in
00:07:32
here so that we can so that everybody
00:07:34
can read this fairly well okay I'll zoom
00:07:38
in to about right there I think is good
00:07:39
okay so we can see our data folder here
00:07:42
that we downloaded and placed in our
00:07:44
Jupyter demo folder a little bit ago but
00:07:47
now let's create a new notebook so to
00:07:50
create a new notebook I'm going to click
00:07:51
on new up here at the top right and then
00:07:54
I'm going to use Python 3 and now we can
00:07:59
name our notebook so up here where it
00:08:01
says untitled I'm going to click here
00:08:03
and I'm just going to call this pandas
00:08:06
demo and rename that ok so now we're
00:08:09
ready to start using pandas so we can
00:08:12
import this by saying import pandas as
00:08:16
pd now importing pandas as pd is just a
00:08:21
common convention when using pandas so
00:08:23
let's run that and I ran that cell by
00:08:27
pressing Shift + Enter and again I'm not
00:08:30
going to go into the specifics of
00:08:31
working here within Jupyter in this
00:08:33
series but if you'd like a rundown of
00:08:35
the features and shortcuts that I'll be
00:08:37
using then I do have a link to my
00:08:39
Jupyter video in the description section
00:08:41
below ok so for the rest of this video
00:08:43
we'll see how to load in our data and
00:08:46
look at some information about that data
00:08:48
so our data is in a CSV format so in
00:08:53
order to read
00:08:53
in that CSV we can simply say df which
00:08:57
is going to stand for data frame we'll
00:08:59
learn all about data frames here
00:09:00
in a bit we're going to say df is equal
00:09:02
to pd.read_csv we're
00:09:07
going to use the read_csv method from
00:09:10
pandas here and now we just want to pass
00:09:13
in a path to our CSV file now mine was
00:09:16
within that data folder and that was
00:09:19
within the file
00:09:22
survey_results_public.csv so
00:09:26
now if I hit shift enter then that will
00:09:30
run that cell so right off the bat we
00:09:33
can see that this is pretty simple to
00:09:34
work with so when using native Python in
00:09:37
order to read in a CSV file we need to
00:09:40
use the CSV module to create a CSV
00:09:42
reader and things like that but here
00:09:45
we're just doing this all in one line so
00:09:48
when it reads this in it's going to read
00:09:50
it in as a data frame so data frames are
00:09:53
pretty much the backbone of pandas and
00:09:55
we'll go over data
00:09:58
frames and series objects in depth in
00:10:01
the next video but for the basics a data
00:10:04
frame is basically just rows and columns
00:10:07
of data we can see what a data frame
00:10:09
looks like just by printing it out
00:10:11
and this is the great thing about using
00:10:13
Jupyter notebooks because it allows us
00:10:15
to visualize these things in ways that
00:10:19
we can't do in other editors so here in
00:10:22
Jupyter I can simply just say df and run
00:10:25
that and it will print out our data
00:10:29
frame here so we didn't even need to
00:10:31
wrap this here in a print function now
00:10:34
if you're using a normal editor then you
00:10:37
can still print out data frame
00:10:39
information but it's not going to look
00:10:42
as good as it does here in Jupyter where
00:10:45
we get this interactive table so this is
00:10:48
a small look at our data now this is
00:10:51
actually 85 columns here but if I scroll
00:10:55
through these then it doesn't look like
00:10:57
there's actually 85 columns printed out
00:11:00
here so this is actually truncated by
00:11:04
default just to give us a broad overview
00:11:07
of the
00:11:07
data so by default Jupyter is displaying
00:11:10
20 columns from our data frame now how
00:11:14
did I know that there was 85 columns for
00:11:17
this data frame well there are a few
00:11:19
attributes and methods that we can use
00:11:21
to get an idea of what our data looks
00:11:24
like so first we have the shape
00:11:26
attribute and shape gives us the number
00:11:31
of rows and columns in a tuple form so
00:11:35
let's look at this so in our next cell
00:11:37
down here I'm gonna say df.shape and
00:11:40
I will run that now this is an attribute
00:11:44
here it's not a method so you don't want
00:11:47
to put parentheses so df.shape and
00:11:50
we can see that we have 88 thousand rows
00:11:55
and 85 columns now if you wanted a bit
00:12:00
more information then we can use the
00:12:02
info method the info method will give us
00:12:04
the number of rows and columns and also
00:12:07
all of the data types of all the columns
00:12:09
as well
00:12:10
now before I run that it looks like my
00:12:14
text is getting cut off here a little
00:12:16
bit sometimes this happens whenever I'm
00:12:19
within Jupyter in order to fix this I
00:12:22
usually just come up here and restart
00:12:25
and run all my cells again that usually
00:12:28
takes care of the problem let's see if
00:12:31
that works okay so that seemed to work
00:12:33
another thing that you can do here is
00:12:35
just to totally reload the page in the
00:12:38
browser and when you reload the page I
00:12:41
think it's just because of how my I have
00:12:44
this text enlarged so it's kind of
00:12:47
messing with how these look but now we
00:12:49
can see these just fine
00:12:50
okay so like I was saying we can see
00:12:54
here that we have eighty eight thousand
00:12:55
eight hundred and eighty three rows and
00:12:58
eighty five columns now if you wanted
00:13:01
more information then we can use the
00:13:03
info method and that will give us the
00:13:06
number of rows and the number of columns
00:13:08
but also all of the data types of the
00:13:11
columns so let's run that so if I do
00:13:14
df.info whoops
00:13:16
df.info now this actually is a
00:13:19
method so we do want to
00:13:21
put the parentheses there and let me
00:13:24
run this and now let's go over this
00:13:27
output so we can see here that it says
00:13:29
that we have eighty-eight thousand eight
00:13:31
hundred and eighty three entries so
00:13:33
those are our rows we have a total of
00:13:35
eighty five columns and then it lists
00:13:38
all of our columns here for our data so
00:13:40
these are all the columns in our CSV
00:13:43
file that we have loaded in now it also
00:13:46
gives us the data types of each of these
00:13:48
columns and we're going to go over data
00:13:50
types in a future video but for the most
00:13:54
part objects usually mean strings and
00:13:57
then we have other things as well so
00:14:00
int64 is just an integer float is a float
00:14:04
so probably a decimal number and there
00:14:08
are no other data types in this data set
00:14:12
but there are more data types in general
00:14:14
so I will be sure to do a video on data
00:14:18
types specifically in the near future
00:14:21
okay so now that we know the number of
00:14:23
rows and columns let's change a setting
00:14:26
here within Jupyter so that we can see
00:14:28
all of the columns so I think it would
00:14:31
be useful to see all of these if we'd
00:14:33
like to even if there are a lot of these
00:14:36
to scroll through so to do this we can
00:14:39
change a setting and I'm gonna come
00:14:41
down here to the bottom here and I'm
00:14:44
gonna change a setting by saying
00:14:46
pd.set_option and within here I
00:14:50
will say display.max_columns
00:14:55
and I will set that equal to 85
00:15:00
so that we can see all of our columns
00:15:02
and I will run that and now if we print
00:15:06
out our data frame so I'm going to go
00:15:08
back up here to where we print it out
00:15:10
this data frame and I will rerun that
00:15:14
cell and now if I scroll through these
00:15:16
columns then we can see that now it
00:15:19
looks like we actually have these 85
00:15:21
different columns here so I can keep
00:15:24
scrolling and keep scrolling and it
00:15:26
didn't just chop us off at that 20 like
00:15:28
it was before
00:15:28
now obviously the rows are also being
00:15:31
truncated here and we definitely
00:15:33
don't want to print
00:15:34
all 89 thousand of these rows but there
00:15:39
probably are some examples with certain
00:15:41
datasets where you might want to see all
00:15:43
of the rows as well so for example I
00:15:46
said that the survey results schema CSV
00:15:49
file that was included in our download
00:15:52
gives the matching questions for all of
00:15:55
these column names here so if we wanted
00:15:58
to see what these column names here mean
00:16:02
for this data then we can load in that
00:16:04
schema CSV file as well so let me do
00:16:08
this I'll go down to the bottom of our
00:16:10
notebook and I will just load this in by
00:16:13
saying schema_df now I don't
00:16:16
want to just call this df because we
00:16:18
don't want to overwrite our other data
00:16:20
frame and I will load this in just like
00:16:23
we saw before by saying
00:16:25
pd.read_csv and this is within the
00:16:29
data folder and this was called
00:16:32
survey_results_schema.csv
00:16:37
so I will run this and now let's
00:16:41
look at this schema data frame that we
00:16:46
just loaded in so here in this Column
00:16:51
column here this gives us all of the
00:16:53
columns in our other data frame so we
00:16:57
have Respondent MainBranch Hobbyist and
00:16:59
if I scroll up to that data frame here
00:17:01
I'm gonna delete this info here since we
00:17:04
no longer need that if I scroll up to
00:17:07
this data frame here then we can see
00:17:09
Respondent MainBranch Hobbyist so if we
00:17:13
want to know what these mean then that's
00:17:15
what we use the schema for so we can see
00:17:17
that MainBranch or Hobbyist means do you
00:17:21
code as a hobby MainBranch means which
00:17:24
of the following options best describes
00:17:25
now it actually truncates the text
00:17:28
too in order to actually see the
00:17:31
full text we could either change an
00:17:34
option or we could just access this
00:17:36
value directly and I will be showing you
00:17:38
how to do that in the next video but for
00:17:41
now we can see that we can't see all of
00:17:44
the rows to the questions that correlate
00:17:48
to each column name here remember we
00:17:50
have 85 columns but for here we can only
00:17:53
see the first five and then we get this
00:17:56
ellipsis here and then we can see the
00:17:58
last five so let's set this up so that
00:18:02
we can view 85 rows and then reprint
00:18:06
this so that we can see all of these so
00:18:08
back in the same cell where we set our
00:18:11
max columns now let's also add one for
00:18:17
rows as well so I'm just going to copy
00:18:19
and paste that but instead of max
00:18:21
columns here I'm gonna have this be max
00:18:23
rows and I will run that and now we will
00:18:27
rerun this schema here and now we can
00:18:31
see that we can see all of the columns
00:18:33
and the corresponding question text so
00:18:37
if you wanted to know what any of these
00:18:38
columns mean then this is how we do it
00:18:42
so we can see IT person the question was
00:18:44
are you the IT support person for your
00:18:47
family so that's probably a yes or no
00:18:49
question so that is what those mean so
00:18:52
if you're going through this data on
00:18:54
your own then you can use this as a
00:18:55
reference anytime you don't know what a
00:18:58
certain column means in our survey data
00:19:00
and if you don't know or if you don't
00:19:03
want to look through all of these to
00:19:05
find a specific row or a specific column
00:19:09
name then in a future video we're going
00:19:11
to learn about filtering data frames and
00:19:14
see how we can just grab a specific row
00:19:16
where the column equals a certain value
00:19:19
okay so now we have all 85 rows visible
00:19:23
of our schema data frame here but you
00:19:26
might be thinking well that's nice but I
00:19:29
don't want to see eighty five rows of my
00:19:31
survey data every time I want to look at
00:19:34
it but there are a couple of methods
00:19:36
that we can use to only see a certain
00:19:39
number of rows which you'll most likely
00:19:41
use a lot just to get an idea that your
00:19:44
filters and data frames seem to be
00:19:46
working correctly so we can see the
00:19:49
first five rows by saying instead of
00:19:51
doing df here we can say df.head
00:19:55
and if I run that then we just get the
00:19:58
first five rows here okay and you can
00:20:01
pass
00:20:02
a value if you want to see a certain
00:20:03
number of values so if you wanted to see
00:20:05
the first ten rows then we could pass in
00:20:08
a ten to df.head and this gives us
00:20:11
the first ten rows so we can see it goes
00:20:13
all the way down zero through nine there
00:20:16
now if you'd like to see the last rows
00:20:18
instead of the first rows then we can
00:20:20
use the tail method instead
00:20:23
so if we say df.tail and
00:20:26
we could use it without a number also
00:20:28
but if we pass in a number just like
00:20:31
with head then now we're going to say
00:20:32
that we want the last ten entries here
00:20:36
in our data so those are the last ten
00:20:38
items of our data okay so this is a
00:20:41
brief overview of getting pandas
00:20:44
installed and then downloading our data
00:20:47
and loading our data in to Jupiter and
00:20:50
how to read this in now before we end
00:20:54
here I'd like to mention the sponsor of
00:20:56
this video and that is brilliant.org so
00:20:59
in this series we've been learning about
00:21:01
pandas and how to analyze data in
00:21:03
Python and Brilliant would be an
00:21:05
excellent way to supplement what you
00:21:06
learn here with their hands-on courses
00:21:08
they have some excellent courses and
00:21:10
lessons that do a deep dive on how to
00:21:11
think about and analyze data correctly
00:21:13
for data analysis fundamentals I would
00:21:16
really recommend checking out their
00:21:17
statistics course which shows you how to
00:21:19
analyze graphs and determine
00:21:20
significance in the data and I would
00:21:22
also recommend their machine learning
00:21:24
course which takes data analysis to a
00:21:26
new level
00:21:26
where you'll learn about the techniques
00:21:28
being used that allow machines to make
00:21:30
decisions where there's just too many
00:21:32
variables for a human to consider so to
00:21:34
support my channel and learn more about
00:21:36
Brilliant you can go to
00:21:38
brilliant.org/cms to sign up for free and
00:21:40
also the first 200 people that go to
00:21:43
that link will get 20% off the annual
00:21:45
premium subscription and you can find
00:21:47
that link in the description section
00:21:48
below
00:21:49
again that's
00:21:52
brilliant.org/cms
00:21:54
okay so I think that is going to do it
00:21:56
for our first pandas video I hope you
00:21:58
feel like you've got a good introduction
00:21:59
on how to install pandas and load in
00:22:01
your data to a Jupyter notebook in the
00:22:03
next video we're going to be learning
00:22:05
more about data frames and also learn
00:22:07
about the series data type so we'll
00:22:10
learn how we can think about data frames
00:22:12
in a way that's easier to understand and
00:22:14
also see how we can
00:22:16
grab certain elements columns and rows
00:22:18
from these as well so be sure to stick
00:22:21
around for that but if anyone has any
00:22:23
questions about what we covered in this
00:22:24
video then feel free to ask in the
00:22:26
comment section below and I'll do my
00:22:27
best to answer those and if you enjoyed
00:22:29
these tutorials and would like to
00:22:30
support them then there are several ways
00:22:32
you can do that the easiest way is to
00:22:34
simply like the video and give it a
00:22:35
thumbs up and also it's a huge help to
00:22:37
share these videos with anyone who you
00:22:38
think would find them useful and if you
00:22:40
have the means you can contribute through
00:22:41
Patreon and there's a link to that page
00:22:43
in the description section below
00:22:44
be sure to subscribe for future videos
00:22:46
and thank you all for watching

Tags

pandas
Python
data analysis
Jupyter Notebook
data manipulation
CSV
installation
pandas tutorial
data science
Stack Overflow dataset