Kaggle learning log_1

Foreword

As a data science major, I planned to start learning machine learning at the beginning of semester 1 of the 2023/2024 academic year. Kaggle is a platform that hosts many data science competitions, which makes it a good place for me to study machine learning and other data science topics. From now on, I will write a series of logs to record my daily progress.

Day 1

On the first day I covered the basics of Python and pandas: the core syntax and the basic operations of both. I had already learnt these two tools last term in the courses Comp1012 and AMA1600, so nothing was strange or new to me. The only thing worth recording is the difference between '.loc' and '.iloc'.
If you just want to select a specific row, and the DataFrame still has its default integer index, the two statements give the same result:

review.loc[0]
review.iloc[0]

Both statements above select the first row, whose index label is 0. When you want to select a specific column, however, the difference between the two shows up.

For example, to select the first column, named 'fang':
review.loc[:, 'fang']
review.iloc[:, 0]

iloc needs only the integer position of the column, while loc needs the name (label) of the column.
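To make the difference concrete, here is a small sketch. The data and the second column name 'points' are made up for illustration; only the name review and the column 'fang' come from the notes above. It also shows a case the notes do not cover: once the row index is no longer the default 0, 1, 2, ..., loc and iloc stop agreeing on rows as well.

```python
import pandas as pd

# Hypothetical data: only `review` and the column name 'fang' come from the log.
review = pd.DataFrame({"fang": [10, 20, 30], "points": [87, 90, 85]})

# With the default integer index, label 0 and position 0 are the same row.
row_by_label = review.loc[0]
row_by_position = review.iloc[0]
print(row_by_label.equals(row_by_position))  # True

# Columns: loc takes the label, iloc takes the integer position.
col_by_label = review.loc[:, "fang"]
col_by_position = review.iloc[:, 0]
print(col_by_label.equals(col_by_position))  # True

# After relabelling the rows, loc follows labels and iloc follows positions.
relabelled = review.copy()
relabelled.index = [2, 0, 1]
print(relabelled.loc[0, "fang"])  # 20: the row whose label is 0
print(relabelled.iloc[0, 0])      # 10: the first row by position
```

Note that passing a list of labels, as in review.loc[:, ['fang']], returns a one-column DataFrame, while a single label or a single position returns a Series.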

Besides that, today I also reviewed a little linear algebra. Here I recommend Professor Gilbert Strang's book Introduction to Linear Algebra, which makes the basic theory of linear algebra easy to pick up.
Chapter one is an introduction to vectors, and it contains some interesting facts about the operations between vectors.

Dot products:
Let $v_{1}$ and $v_{2}$ be vectors of the same dimension, $v_{1}=(x_{1},y_{1})$, $v_{2}=(x_{2},y_{2})$.
The dot product of $v_{1}$ and $v_{2}$ is: $v_{1}\cdot v_{2}=x_{1}x_{2}+y_{1}y_{2}$
When a matrix multiplies a vector, each entry of the result is the dot product of one row of the matrix with the vector.
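As a quick illustration of both statements, here is a minimal sketch in plain Python. The helper names dot and mat_vec and the sample numbers are my own choices for this example; in practice a library such as NumPy would normally do this work.

```python
def dot(v, w):
    """Dot product of two vectors of the same dimension."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(v, w))


def mat_vec(matrix, v):
    """Matrix times vector: one dot product per row of the matrix."""
    return [dot(row, v) for row in matrix]


v1 = (1, 2)
v2 = (3, 4)
print(dot(v1, v2))  # 1*3 + 2*4 = 11

A = [(1, 2),
     (0, 1)]
print(mat_vec(A, v2))  # [1*3 + 2*4, 0*3 + 1*4] = [11, 4]
```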

From the definition we know that $v\cdot w=|v||w|\cos\theta$ and $\cos\theta\in[-1,1]$, so

we get two inequalities:
Schwarz inequality: $|v\cdot w|\le |v||w|$
Triangle inequality: $|v+w|\le |v|+|w|$
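Both inequalities are easy to check numerically. The sketch below uses hand-picked sample vectors and hypothetical helper names (dot, norm, schwarz_holds, triangle_holds); it is a sanity check on a few examples, not a proof.

```python
import math


def dot(v, w):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    """Length |v| of a vector, computed as sqrt(v . v)."""
    return math.sqrt(dot(v, v))


def schwarz_holds(v, w, tol=1e-12):
    """Check |v . w| <= |v||w| up to a small floating-point tolerance."""
    return abs(dot(v, w)) <= norm(v) * norm(w) + tol


def triangle_holds(v, w, tol=1e-12):
    """Check |v + w| <= |v| + |w| up to a small floating-point tolerance."""
    total = [a + b for a, b in zip(v, w)]
    return norm(total) <= norm(v) + norm(w) + tol


v = (3.0, 4.0)
w = (4.0, -3.0)  # perpendicular to v, so cos(theta) is 0
u = (6.0, 8.0)   # same direction as v, so cos(theta) is 1

for a, b in [(v, w), (v, u), (w, u)]:
    cos_theta = dot(a, b) / (norm(a) * norm(b))
    print(cos_theta, schwarz_holds(a, b), triangle_holds(a, b))
```

Equality in the Schwarz inequality shows up for v and u, which point in the same direction; for the perpendicular pair the left-hand side is simply 0.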
From the Schwarz inequality we can deduce that 'geometric mean' $\le$ 'arithmetic mean'.
For two non-negative numbers $x$ and $y$ it is easy to prove: $\frac{1}{2}(x+y)\ge \sqrt{xy}$
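One way to fill in the two-number case is sketched below. The choice of the vectors $(a,b)$ and $(b,a)$ is an assumption on my part, since the notes do not say which vectors were used.

```latex
\begin{aligned}
v &= (a, b), \quad w = (b, a), \qquad a, b \ge 0 \\
|v \cdot w| &= 2ab, \qquad |v||w| = \sqrt{a^2+b^2}\,\sqrt{b^2+a^2} = a^2 + b^2 \\
2ab &\le a^2 + b^2 \quad \text{(Schwarz)} \\
\sqrt{xy} &\le \tfrac{1}{2}(x+y) \quad \text{with } x = a^2,\ y = b^2
\end{aligned}
```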

But how do we prove it in higher dimensions?

Let $v_{1}=(x,y,z)$, $v_{2}=(y,z,x)$, $v_{3}=(z,x,y)$.
According to the Schwarz inequality, we can conclude that $\frac{1}{3}(x+y+z)\ge \sqrt[3]{xyz}$ for non-negative $x$, $y$, $z$.
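The notes do not spell out how the conclusion follows, so here is one possible route, offered as a sketch. It applies Schwarz to $v_{1}$ and $v_{2}$ and then uses the standard factorization of $x^3+y^3+z^3-3xyz$. That identity and the renaming to $a$, $b$, $c$ in the last line are my additions, and all quantities are assumed non-negative.

```latex
\begin{aligned}
v_1 \cdot v_2 &= xy + yz + zx, \qquad |v_1| = |v_2| = \sqrt{x^2+y^2+z^2} \\
xy + yz + zx &\le x^2 + y^2 + z^2 \quad \text{(Schwarz)} \\
x^3 + y^3 + z^3 - 3xyz &= (x+y+z)\,(x^2+y^2+z^2 - xy - yz - zx) \ge 0 \\
\tfrac{1}{3}(a+b+c) &\ge \sqrt[3]{abc} \quad \text{with } a = x^3,\ b = y^3,\ c = z^3
\end{aligned}
```

The last line is the three-number inequality above, with $a$, $b$, $c$ playing the role of $x$, $y$, $z$.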
The same inequality holds in higher dimensions.