#2882
Drop Duplicate Rows
Easy, Hash Map, Array
Approaches
Brute Force, Optimal
Complexity Comparison
| | Brute Force | Optimal Solution ★ |
|---|---|---|
| Time | O(n²) | O(n) |
| Space | O(1) | O(n) |
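
The brute-force column is not spelled out on this page, so the following is only a sketch of what such an approach might look like, assuming it means comparing each row's email against every earlier row. The function name `drop_duplicates_brute_force` is hypothetical. The nested loop is what produces the O(n²) time; the sketch collects the positions to keep in a list for readability, so it illustrates the quadratic running time rather than the constant-space figure.

```python
import pandas as pd

def drop_duplicates_brute_force(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative quadratic-time version: keep a row only if no earlier row has the same email."""
    emails = df["email"].tolist()
    keep_positions = []
    for i in range(len(emails)):
        seen_before = False
        # Compare row i against every earlier row (this inner loop makes it O(n^2)).
        for j in range(i):
            if emails[j] == emails[i]:
                seen_before = True
                break
        if not seen_before:
            keep_positions.append(i)
    # Select the surviving rows by position, preserving their original order.
    return df.iloc[keep_positions]
```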
Intuition
Time O(n), Space O(n)
The optimal approach uses the built-in pandas 'drop_duplicates' method to drop rows that repeat a value in the 'email' column. pandas detects duplicates with hashing, so each row is examined once instead of being compared against every other row.
Algorithm
4 steps
1. Use the pandas 'drop_duplicates' method on the DataFrame.
2. Specify the 'email' column to check for duplicates.
3. Set the 'keep' parameter to 'first' (the default) to retain the first occurrence.
4. Return the modified DataFrame.
solution.py

    # Full working Python code
    import pandas as pd

    def drop_duplicates_optimal(df):
        # Keep the first row for each email and drop later repeats.
        return df.drop_duplicates(subset='email', keep='first')
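
To see what the call does on concrete data, here is a small usage sketch. The sample rows and the `customers` variable name are made up for illustration; only the `email` column name is taken from the solution.

```python
import pandas as pd

# Made-up sample data; only the 'email' column name comes from the solution.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "name": ["Ann", "Bob", "Cid", "Dee"],
    "email": ["a@example.com", "b@example.com", "a@example.com", "d@example.com"],
})

deduped = customers.drop_duplicates(subset="email", keep="first")

# Cid (customer_id 3) shares an email with Ann (customer_id 1), so that row is dropped.
print(deduped["customer_id"].tolist())  # [1, 2, 4]

# The surviving rows keep their original index labels, leaving a gap at 2.
print(deduped.index.tolist())  # [0, 1, 3]
```

Note that `drop_duplicates` returns a new DataFrame here; `customers` itself still has all four rows.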
Complexity note: Time is O(n) on average because each row is processed once using hash-based lookups. Space is O(n) for the set or map that stores the emails already seen.
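
The complexity note can be made concrete with a hand-written version that uses an explicit set. This is a sketch for illustration only, not how pandas is implemented internally; the function name `drop_duplicates_with_set` is hypothetical. Each row costs one set lookup and one insert, which gives O(n) expected time, and the set can grow to hold every distinct email, which gives O(n) space.

```python
import pandas as pd

def drop_duplicates_with_set(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative single-pass version: track emails already seen in a set."""
    seen = set()
    keep_mask = []
    for email in df["email"]:
        # True only the first time this email appears.
        keep_mask.append(email not in seen)
        seen.add(email)
    # Boolean mask selects the first occurrence of each email, in original order.
    return df.loc[keep_mask]
```

On a DataFrame with an `email` column, this should select the same rows as `df.drop_duplicates(subset='email', keep='first')`.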
1. Using built-in functions can significantly reduce complexity.
2. Understanding the data structure helps in choosing the right approach.
Solutions and explanations are original Tejav content. Problem titles © LeetCode — use the LeetCode button above for the full problem statement.