Membership Constraints Flashcards
What is set()?
set() builds a set, a collection that stores multiple items in a single variable.
A set keeps only unique values, so duplicates are dropped.
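A minimal sketch of that behavior, using a made-up list of blood types (the data and variable names are assumptions for illustration only):

```python
# Hypothetical sample data: blood types with repeated entries
blood_types = ["A+", "O-", "A+", "B+", "O-"]

# set() collapses duplicates so each value appears once
unique_types = set(blood_types)

print(len(blood_types))      # number of items in the original list
print(len(unique_types))     # number of distinct values
print("A+" in unique_types)  # membership test on the set
```

Sets are unordered, so do not rely on the order in which the values print.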
What is an anti join?
An anti join takes two DataFrames, A and B, and returns the data from one that is not contained in the other. A left anti join of A and B returns the rows of A whose values in the common join column are found only in A, not in B.
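pandas has no dedicated anti-join option in merge(), so one common way to sketch a left anti join is a left merge with indicator=True followed by a filter. The DataFrames A and B and their columns below are hypothetical:

```python
import pandas as pd

# Hypothetical data: A holds observed records, B holds the allowed categories
A = pd.DataFrame({"name": ["Ann", "Bob", "Cy"],
                  "blood_type": ["A+", "Z+", "O-"]})
B = pd.DataFrame({"blood_type": ["A+", "B+", "O-"]})

# Left merge with an indicator column that records where each row matched
merged = A.merge(B, on="blood_type", how="left", indicator=True)

# Keep only the rows of A that found no match in B, then drop the helper column
left_anti = merged[merged["_merge"] == "left_only"].drop(columns="_merge")

print(left_anti)
```

Here only the row with the blood type "Z+" survives, because that value never appears in B.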
What is an inner join?
An inner join returns only the data contained in both DataFrames. For example, an inner join of A and B returns columns from both DataFrames, but only for values of the common join column that are found in both A and B.
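As an illustration, an inner join can be sketched with merge(how="inner"). The DataFrames and the extra "rh" column are invented for this example:

```python
import pandas as pd

# Hypothetical data: A holds observed records, B describes the allowed categories
A = pd.DataFrame({"name": ["Ann", "Bob", "Cy"],
                  "blood_type": ["A+", "Z+", "O-"]})
B = pd.DataFrame({"blood_type": ["A+", "B+", "O-"],
                  "rh": ["positive", "positive", "negative"]})

# Keep only rows whose blood_type appears in both A and B
inner = A.merge(B, on="blood_type", how="inner")

print(inner)
```

The result carries columns from both DataFrames (name, blood_type, rh), and the row with "Z+" is gone because B has no such value.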
How do you find inconsistent categories?
Use set() to get the unique values in the column, then take the difference from the allowed categories:
inconsistent_categories = set(dataframe[column]).difference(categories[column])
What is difference()?
A set method that returns a new set containing the items that are in the first set but not in the second.
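A short sketch with two made-up sets shows that the result depends on which set calls the method:

```python
# Hypothetical sets of observed and allowed values
observed = {"A+", "Z+", "O-"}
allowed = {"A+", "B+", "O-"}

# Items in observed that are missing from allowed
only_observed = observed.difference(allowed)

# Swapping the sets gives a different answer
only_allowed = allowed.difference(observed)

# The - operator is equivalent when both operands are sets
same_result = observed - allowed

print(only_observed, only_allowed)
```

Unlike the - operator, difference() also accepts any iterable as its argument, which is why it can take a DataFrame column directly.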
How do you find inconsistent rows?
Build a boolean mask of the rows that hold an inconsistent category:
inconsistent_rows = dataframe[column].isin(inconsistent_categories)
Subset the data with the mask:
inconsistent_data = dataframe[inconsistent_rows]
What is the isin() method?
A pandas method used to filter DataFrames. It returns a boolean mask marking the rows whose value in a column matches one (or several) given values.
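A minimal sketch of isin() on a single column, using an invented DataFrame:

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"],
                   "blood_type": ["A+", "Z+", "O-"]})

# True where the value is one of the listed values, False otherwise
mask = df["blood_type"].isin(["Z+", "C-"])

# Indexing with the mask keeps only the True rows
matches = df[mask]

print(mask.tolist())
print(matches)
```

Values in the list that never occur in the column ("C-" here) are simply ignored.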
How do you drop inconsistent rows?
Negate the boolean mask with ~ to keep only the consistent rows:
consistent_data = dataframe[~inconsistent_rows]
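Putting the cards together, the whole workflow might look like the sketch below. The contents of dataframe and categories, and the column name, are assumptions made up for the example:

```python
import pandas as pd

# Hypothetical data: observed records and the predefined categories
dataframe = pd.DataFrame({"name": ["Ann", "Bob", "Cy", "Dee"],
                          "blood_type": ["A+", "Z+", "O-", "Z+"]})
categories = pd.DataFrame({"blood_type": ["A+", "B+", "O-"]})
column = "blood_type"

# 1. Find values that are not in the predefined categories
inconsistent_categories = set(dataframe[column]).difference(categories[column])

# 2. Flag the rows holding those values
inconsistent_rows = dataframe[column].isin(inconsistent_categories)

# 3. Split the data into inconsistent and consistent parts
inconsistent_data = dataframe[inconsistent_rows]
consistent_data = dataframe[~inconsistent_rows]

print(inconsistent_categories)
print(consistent_data)
```

Dropping is only one option; depending on the data, inconsistent values could instead be remapped to a valid category.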
What is a membership constraint?
A membership constraint requires that a categorical column contain only values from a predefined set of categories. It is violated when the column has values outside that set.