2023年9月26日 星期二

9/26 每日一題 pandas(找出地N大的薪水)

Table: Employee

+-------------+------+
| Column Name | Type |
+-------------+------+
| id          | int  |
| salary      | int  |
+-------------+------+
id is the primary key (column with unique values) for this table.
Each row of this table contains information about the salary of an employee.

 

Write a solution to find the nth highest salary from the Employee table. If there is no nth highest salary, return null.

The result format is in the following example.

 

Example 1:

Input: 
Employee table:
+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
| 2  | 200    |
| 3  | 300    |
+----+--------+
n = 2
Output: 
+------------------------+
| getNthHighestSalary(2) |
+------------------------+
| 200                    |
+------------------------+

Example 2:

Input: 
Employee table:
+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
+----+--------+
n = 2
Output: 
+------------------------+
| getNthHighestSalary(2) |
+------------------------+
| null                   |
+------------------------+





import pandas as pd

def nth_highest_salary(employee: pd.DataFrame, N: int) -> pd.DataFrame:
    res=set()
    n=employee['salary'].nunique()
    print(f'有{n}組')
    clo_name=f'getNthHighestSalary({N})'
    if N > n:
        print('沒戲')
        return pd.DataFrame({clo_name:[None]})
    else:
        for ind,i in employee.iterrows():
            row=i['salary']
            print(row)
            res.add(row)
        res=list(res)
        res.sort(reverse=True)
        print(f'res={res}')
        df=pd.DataFrame({clo_name:res[N-1]},index=[0])
        print(df)
        return df
--------------------------------------------

import pandas as pd

def nth_highest_salary(employee: pd.DataFrame, N: int) -> pd.DataFrame:
    unique_salaries = employee.salary.unique()
    if len(unique_salaries) < N:
        return pd.DataFrame([np.NaN], columns=[f'getMthHighestSalary({N})'])
    else:
        salary = sorted(unique_salaries, reverse=True)[N-1]
        return pd.DataFrame([salary], columns=[f'getMthHighestSalary({N})'])



--------------參考

方法

  • 刪除任何重複的工資值:在「工資」列上使用 drop_duplicates() 函數刪除任何重複的工資值(如果存在)。此步驟對於避免將重複項計為單獨的薪資等級至關重要。

  • 將唯一工資進行排序:對唯一工資值使用 sort_values() 函數以降序對它們進行排序。

  • 檢查 N 是否在有效範圍內:驗證所需的第 N 個最高工資是否在 DataFrame 中可用的唯一工資值的範圍內。如果 N 超過唯一工資的數量,則沒有第 N 個最高工資,我們返回「無」。

  • 取得第 N 個最高工資:使用 iloc[N - 1] 從排序後的工資中取得第 N 個最高工資。N 減 1,因為 Python 使用從零開始的索引

程式碼



import pandas as pd

def nth_highest_salary(employee: pd.DataFrame, N: int) -> pd.DataFrame:

    # Drop any duplicate salary values to avoid counting duplicates as separate salary ranks
    unique_salaries = employee['salary'].drop_duplicates()

    # Sort the unique salaries in descending order and get the Nth highest salary
    sorted_salaries = unique_salaries.sort_values(ascending=False)

    # If N exceeds the number of unique salaries, return None
    if N > len(sorted_salaries):
        return pd.DataFrame({'Nth Highest Salary': [None]})
    
    # Get the Nth highest salary from the sorted salaries
    nth_highest = sorted_salaries.iloc[N - 1]
    
    return pd.DataFrame({'Nth Highest Salary': [nth_highest]})

------------

標籤:

0 個意見:

張貼留言

訂閱 張貼留言 [Atom]

<< 首頁