2023年9月25日 星期一

9/24 每日一題(MYSQL 計算每個教授教授唯一科目的數量)

Table: Teacher

+-------------+------+
| Column Name | Type |
+-------------+------+
| teacher_id  | int  |
| subject_id  | int  |
| dept_id     | int  |
+-------------+------+
(subject_id, dept_id) is the primary key (combinations of columns with unique values) of this table.
Each row in this table indicates that the teacher with teacher_id teaches the subject subject_id in the department dept_id.

 

Write a solution to calculate the number of unique subjects each teacher teaches in the university.

Return the result table in any order.

The result format is shown in the following example.

 

Example 1:

Input: 
Teacher table:
+------------+------------+---------+
| teacher_id | subject_id | dept_id |
+------------+------------+---------+
| 1          | 2          | 3       |
| 1          | 2          | 4       |
| 1          | 3          | 3       |
| 2          | 1          | 1       |
| 2          | 2          | 1       |
| 2          | 3          | 1       |
| 2          | 4          | 1       |
+------------+------------+---------+
Output:  
+------------+-----+
| teacher_id | cnt |
+------------+-----+
| 1          | 2   |
| 2          | 4   |
+------------+-----+
Explanation: 
Teacher 1:
  - They teach subject 2 in departments 3 and 4.
  - They teach subject 3 in department 3.
Teacher 2:
  - They teach subject 1 in department 1.
  - They teach subject 2 in department 1.
  - They teach subject 3 in department 1.
  - They teach subject 4 in department 1.

 # Write your MySQL query statement below

select teacher_id ,count(distinct subject_id) as cnt
from Teacher
group by teacher_id

------------------------



import pandas as pd

def count_unique_subjects(teacher: pd.DataFrame) -> pd.DataFrame:
    res = teacher.groupby('teacher_id')[['subject_id']].nunique().reset_index()
    res = res.rename(columns={'subject_id':'cnt'})
    return res


-------#其中使用了函數nunique() 計算不重複的值
以及使用了groupby()進行分組, 所以會以分組的依據成為新的索引值, 於是使用的reset_index()洗掉索引
最後在使用df.rename(columns={old_name:new_name})

--------------------參考
import pandas as pd

def count_unique_subjects(teacher: pd.DataFrame) -> pd.DataFrame:
    g = teacher.groupby('teacher_id')
    results = g['subject_id'].nunique()
    return pd.DataFrame({'teacher_id':g.groups.keys(),'cnt':results})

    #result=teacher.groupby('teacher_id')['subject_id'].nunique().reset_index()
    #return result.rename({'subject_id':'cnt'},axis=1)

#將按照teacher_id分組的結果賦值給g
#g['subject_id'].nunique() 這一行計算每個tracher_id分組中不重複的subject_id數量
#g.groups.keys() 這一部分返回分組的唯一teacher_id值

------------------------

import pandas as pd

def count_unique_subjects(teacher: pd.DataFrame) -> pd.DataFrame:
    df = teacher.drop_duplicates(["teacher_id", "subject_id"])
    df = df["teacher_id"].value_counts().reset_index()
    df.columns = ["teacher_id", "cnt"]
    return df


#觀念
#1.drop_duplicates()這個函數用來刪除指定columns中的重複rows, 這裡指定的teacher_id, subject_id
#2.然後再使用value_counts() 計算teacher_id出現的次數
#3.df.columns 這一行將teacher_id更名為cnt




標籤:

0 個意見:

張貼留言

訂閱 張貼留言 [Atom]

<< 首頁