2023年9月22日 星期五

9/21 每日一題(MYSQL 查詢請求成功率為多少)

Table: Signups

+----------------+----------+
| Column Name    | Type     |
+----------------+----------+
| user_id        | int      |
| time_stamp     | datetime |
+----------------+----------+
user_id is the column of unique values for this table.
Each row contains information about the signup time for the user with ID user_id.

 

Table: Confirmations

+----------------+----------+
| Column Name    | Type     |
+----------------+----------+
| user_id        | int      |
| time_stamp     | datetime |
| action         | ENUM     |
+----------------+----------+
(user_id, time_stamp) is the primary key (combination of columns with unique values) for this table.
user_id is a foreign key (reference column) to the Signups table.
action is an ENUM (category) of the type ('confirmed', 'timeout')
Each row of this table indicates that the user with ID user_id requested a confirmation message at time_stamp and that confirmation message was either confirmed ('confirmed') or expired without confirming ('timeout').

 

The confirmation rate of a user is the number of 'confirmed' messages divided by the total number of requested confirmation messages. The confirmation rate of a user that did not request any confirmation messages is 0. Round the confirmation rate to two decimal places.

Write a solution to find the confirmation rate of each user.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Signups table:
+---------+---------------------+
| user_id | time_stamp          |
+---------+---------------------+
| 3       | 2020-03-21 10:16:13 |
| 7       | 2020-01-04 13:57:59 |
| 2       | 2020-07-29 23:09:44 |
| 6       | 2020-12-09 10:39:37 |
+---------+---------------------+
Confirmations table:
+---------+---------------------+-----------+
| user_id | time_stamp          | action    |
+---------+---------------------+-----------+
| 3       | 2021-01-06 03:30:46 | timeout   |
| 3       | 2021-07-14 14:00:00 | timeout   |
| 7       | 2021-06-12 11:57:29 | confirmed |
| 7       | 2021-06-13 12:58:28 | confirmed |
| 7       | 2021-06-14 13:59:27 | confirmed |
| 2       | 2021-01-22 00:00:00 | confirmed |
| 2       | 2021-02-28 23:59:59 | timeout   |
+---------+---------------------+-----------+
Output: 
+---------+-------------------+
| user_id | confirmation_rate |
+---------+-------------------+
| 6       | 0.00              |
| 3       | 0.00              |
| 7       | 1.00              |
| 2       | 0.50              |
+---------+-------------------+
Explanation: 
User 6 did not request any confirmation messages. The confirmation rate is 0.
User 3 made 2 requests and both timed out. The confirmation rate is 0.
User 7 made 3 requests and all were confirmed. The confirmation rate is 1. 

User 2 made 2 requests where one was confirmed and the other timed out. The confirmation rate is 1 / 2 = 0.5. 

# Write your MySQL query statement below
select s.user_id ,
ifnull(sum(action='confirmed')/count(c.action),0) as confirmation_rate

from Signups s

left join Confirmations c on s.user_id =c.user_id

group by user_id


--------------------------
select 
    s.user_id,
    round((sum(case when c.action='confirmed' then 1 else 0 end)/count(*)),2)  as confirmation_rate
from signups s 
left join Confirmations c
on s.user_id=c.user_id
group by s.user_id

----------------
# Write your MySQL query statement below
SELECT Signups.user_id, ROUND(AVG(IF(Confirmations.action='confirmed',1,0)),2) as "confirmation_rate" 
FROM Signups
LEFT JOIN Confirmations on Confirmations.user_id = Signups.user_id
GROUP BY Signups.user_id



-------------------
 
import pandas as pd



def confirmation_rate(signups: pd.DataFrame, confirmations: pd.DataFrame) -> pd.DataFrame:
  res_dict={}
  res_toal={}
  for i,row in signups.iterrows():
    res_dict[row['user_id']]=0
    res_toal[row['user_id']]=0
 
  print(f'建立的key={res_dict}')
  for i,row in confirmations.iterrows():
    if row['action'] == 'confirmed':
      res_dict[row['user_id']] += 1
    res_toal[row['user_id']] +=1
   
  print(f'確認次數:{res_dict}')
  print(f'確認次數:{res_toal}')

  '確認率'
  rate=[]
  for i,j in zip(res_dict.values(),res_toal.values()):
    print(f'i={i}')
    print(f'j={j}')
    if j !=0:
      data=i/j
    else:
      data=0
    rate.append(round(data,2))
  print(rate)
  res_df=pd.DataFrame({
    'user_id':list(res_toal),
    'confirmation_rate':rate
  })
  return res_df
   


--------------------------


import pandas as pd

def confirmation_rate(signups: pd.DataFrame, confirmations: pd.DataFrame) -> pd.DataFrame:
    # Perform a left join on user_id
    merged_df = signups.merge(confirmations, on='user_id', how='left')


    # Calculate confirmation rate for each user_id
    merged_df['confirmation_rate'] = merged_df.apply(lambda row: 1.0 if row['action'] == 'confirmed' else 0.0, axis=1)

    # Group by user_id and calculate average confirmation rate
    confirmation_rate_df = merged_df.groupby('user_id')['confirmation_rate'].mean().reset_index()

    # Round the confirmation_rate to 2 decimal places
    confirmation_rate_df['confirmation_rate'] = confirmation_rate_df['confirmation_rate'].round(2)

    return confirmation_rate_df
import pandas as pd

s = pd.DataFrame({
    'user_id': [3, 7, 2, 6]
})
print(s)

c = pd.DataFrame({
    'user_id': [3, 3, 7, 7, 7, 2, 2],
    'action': ['timeout', 'timeout', 'confirmed', 'confirmed',
    'confirmed', 'confirmed', 'timeout']})
merged = s.merge(c, how='left', on='user_id')
print(merged)
print('='*30)

merged['confirmation_rate'] = merged.apply(
    lambda row: 1.0 if row['action'] == 'confirmed' else 0.0, axis=1)

print(merged)

# confirmation_rate_df =merged.groupby('user_id')['confirmation_rate'].mean()
print(merged.groupby('user_id')['confirmation_rate'].mean().reset_index())
# 使用user_id進行分組 在使用mean()計算平均, 最後使用reset_index()重設索引

標籤:

0 個意見:

張貼留言

訂閱 張貼留言 [Atom]

<< 首頁