Attempt
Entering Dcard's 西斯版 (the sex board)
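The page cannot be entered until the visitor clicks a button confirming that they are 18 or older.
Here we can use Selenium automation to simulate that click: locate the element, then click it.
Inspecting the page, we can see that the confirmation control has the attribute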
class='atm_26_4t66r8 atm_1qrujw7_1x4eueo atm_9s_116y0ak atm_h_1h6ojuz atm_fc_1h6ojuz atm_1s_glywfm atm_uc_q8loe9 atm_mk_h2mmj6 atm_kd_glywfm atm_vb_glywfm atm_3f_glywfm atm_l8_1jvvdbw atm_rd_glywfm atm_cs_bfngof atm_9j_tlke0l atm_18yqj6q_13gfvf7 atm_1ksgpba_1skhajo atm_9i962p_f6fqlb atm_1f62j80_exct8b atm_c8_187sfk0 atm_5j_19bvopo atm_g3_qslrf5 atm_1ny5zik_olvwno atm_nluod_1v7wvc0 atm_1rdjdmm_ucc2wb atm_7l_1a11ub3 atm_1vv33dc_v2fha3 atm_1gqaixb_oumlfv atm_1pl68g0_1v7wvc0 atm_1765c25_1q4968j atm_wzxrn8_1ez0meh atm_1bnvlz8_1debaa8'
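and is a button tag.
Because the element carries so many class values, By.CLASS_NAME is a poor fit. With Selenium's driver.find_element() or driver.find_elements(), By.CLASS_NAME matches on a single class name and cannot take several classes at once.
So for an element with many classes, we switch to XPath or a CSS selector instead.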
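To make the multi-class point concrete, here is a minimal sketch of turning a class attribute string into a compound CSS selector or an XPath predicate. The helper names css_for_classes and xpath_for_classes are hypothetical (they are not part of Selenium), and the sketch assumes that a couple of the button's classes are enough to single it out, which has not been verified against the live page.

```python
def css_for_classes(tag, class_attr):
    """Build a compound CSS selector such as 'button.a.b' from a class attribute string."""
    classes = class_attr.split()
    return tag + "".join("." + name for name in classes)


def xpath_for_classes(tag, class_attr):
    """Build an XPath that requires every listed class to be present as a whole token."""
    tests = [
        "contains(concat(' ', normalize-space(@class), ' '), ' {} ')".format(name)
        for name in class_attr.split()
    ]
    return "//{}[{}]".format(tag, " and ".join(tests))


# Two of the classes taken from the button above (an arbitrary subset for illustration).
subset = "atm_26_4t66r8 atm_1qrujw7_1x4eueo"
print(css_for_classes("button", subset))
print(xpath_for_classes("button", subset))
```

Either string can then be passed to driver.find_element() with By.CSS_SELECTOR or By.XPATH respectively. The class names look machine-generated, so they may well change when the site is rebuilt; treat any selector built from them as fragile.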
XPath
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# Set up the driver
driver = webdriver.Chrome()
# Open the target URL
driver.get('https://www.dcard.tw/f/sex')
# Wait for the page to load (a fixed pause; lengthen it if the dialog has not rendered yet)
time.sleep(0.5)
# Locate the button with an absolute XPath
cl_18 = driver.find_element(By.XPATH, '/html/body/div[2]/div[2]/div/div/div/button[2]')
cl_18.click()  # click the element (confirm age 18 or over)
CSS selector
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# Set up the driver
driver = webdriver.Chrome()
# Open the target URL
driver.get('https://www.dcard.tw/f/sex')
# Wait for the page to load (a fixed pause; lengthen it if the dialog has not rendered yet)
time.sleep(0.5)
# Locate the button with a CSS selector
cl_18=driver.find_element(By.CSS_SELECTOR,'body > div.__portal > div.atm_piekax_1nc58l7.atm_mk_1n9t6rb.atm_tk_idpfg4.atm_fq_idpfg4.atm_vy_1osqo2v.atm_e2_1osqo2v.atm_wq_18b4za2.atm_ks_b4ywaf.atm_kd_glywfm.s13agepl.overlay-enter-done > div > div > div > button.atm_26_4t66r8.atm_1qrujw7_1x4eueo.atm_9s_116y0ak.atm_h_1h6ojuz.atm_fc_1h6ojuz.atm_1s_glywfm.atm_uc_q8loe9.atm_mk_h2mmj6.atm_kd_glywfm.atm_vb_glywfm.atm_3f_glywfm.atm_l8_1jvvdbw.atm_rd_glywfm.atm_cs_bfngof.atm_9j_tlke0l.atm_18yqj6q_13gfvf7.atm_1ksgpba_1skhajo.atm_9i962p_f6fqlb.atm_1f62j80_exct8b.atm_c8_187sfk0.atm_5j_19bvopo.atm_g3_qslrf5.atm_1ny5zik_olvwno.atm_nluod_1v7wvc0.atm_1rdjdmm_ucc2wb.atm_7l_1a11ub3.atm_1vv33dc_v2fha3.atm_1gqaixb_oumlfv.atm_1pl68g0_1v7wvc0.atm_1765c25_1q4968j.atm_wzxrn8_1ez0meh.atm_1bnvlz8_1debaa8')
cl_18.click()  # click the element (confirm age 18 or over)
Both approaches above get us past the age check and into the page.
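Both scripts rely on a fixed time.sleep(0.5), which can fail if the age dialog takes longer to appear. Selenium ships WebDriverWait for this purpose; the sketch below is a hand-rolled polling loop standing in for it, so the idea is visible in plain Python. The function name click_when_present and its timeout and poll parameters are hypothetical, and it catches every exception on purpose because it does not import Selenium's own exception classes.

```python
import time


def click_when_present(driver, by, value, timeout=10.0, poll=0.5):
    """Retry find_element + click until it succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            element = driver.find_element(by, value)
            element.click()
            return element
        except Exception as exc:  # broad on purpose: any lookup or click failure triggers a retry
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError("element not clickable: {}".format(value)) from last_error
        time.sleep(poll)
```

With the driver from the scripts above, a call might look like click_when_present(driver, By.XPATH, '/html/body/div[2]/div[2]/div/div/div/button[2]'), replacing both the sleep and the separate click.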
Labels: web scraping