Collecting Daily Price Data for Agricultural Products
1. Website to scrape
http://www./priceDetail.html
2. Scraping goal
Collect pork price data for December 2022.
Target URL
- pubDateStartTime: start date
- pubDateEndTime: end date
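The two date parameters can be generated rather than typed by hand. The following is a minimal sketch, assuming the `YYYY/MM/DD` format used by the script in section 3; `month_range` is a hypothetical helper name, not part of the site's API.

```python
import calendar
from datetime import date


def month_range(year: int, month: int) -> dict:
    """Return the two date parameters covering one full calendar month."""
    # monthrange returns (weekday of first day, number of days in month)
    last_day = calendar.monthrange(year, month)[1]
    return {
        "pubDateStartTime": date(year, month, 1).strftime("%Y/%m/%d"),
        "pubDateEndTime": date(year, month, last_day).strftime("%Y/%m/%d"),
    }


print(month_range(2022, 12))
# {'pubDateStartTime': '2022/12/01', 'pubDateEndTime': '2022/12/31'}
```

Using `calendar.monthrange` avoids hard-coding the last day of the month, which matters for February and for 30-day months.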
First-level categories:
LOne = {"蔬菜": 1186, "水果": 1187, "肉禽蛋": 1189,
        "水產": 1190, "糧油": 1188, "豆制品": 1203,
        "調料": 1204}
Second-level categories:
LTwo = {"水菜": 1199, "特菜": 1200,
        "進口果": 1201, "干果": 1202,
        "豬肉類": 1205, "牛肉類": 1206, "羊肉類": 1207, "禽蛋類": 1208,
        "淡水魚": 1209, "海水魚": 1210, "蝦蟹類": 1217, "貝殼類": 1218, "其他類": 1211,
        "米面類": 1212, "雜糧類": 1213, "食用油": 1214}
Target data
- Content-Type: application/json;charset=UTF-8
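Because the response is JSON, it can be decoded directly into Python dictionaries and lists. The sketch below parses a hand-written sample body; the field names come from the script in section 3, but every value in `sample_body` is a made-up placeholder, not real price data. It also assumes that `prodPcat` is the parent (first-level) category, mirroring the `prodPcatid` request parameter. `record_to_row` is a hypothetical helper name.

```python
import json

# Hand-written placeholder body; the values are NOT real data from the site.
sample_body = """
{"count": 1, "limit": 200, "list": [
  {"prodPcat": "parent-category", "prodCat": "sub-category",
   "prodName": "example-item", "lowPrice": "1.0", "avgPrice": "1.5",
   "highPrice": "2.0", "specInfo": "", "place": "", "unitInfo": "unit",
   "pubDate": "2022-12-01 00:00:00"}
]}
"""


def record_to_row(record: dict) -> list:
    """Flatten one record from the 'list' field into a CSV row."""
    return [
        record["prodPcat"], record["prodCat"], record["prodName"],
        record["lowPrice"], record["avgPrice"], record["highPrice"],
        record["specInfo"], record["place"], record["unitInfo"],
        record["pubDate"].split(" ")[0],  # keep the date, drop the time
    ]


parsed = json.loads(sample_body)
rows = [record_to_row(r) for r in parsed["list"]]
print(rows[0][-1])  # 2022-12-01
```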
3. Scraping code
import csv
import math
import time

import requests

# First-level category IDs (keys are the site's Chinese labels).
LOne = {"蔬菜": 1186, "水果": 1187, "肉禽蛋": 1189,
        "水產": 1190, "糧油": 1188, "豆制品": 1203,
        "調料": 1204}
# Second-level category IDs (for example, 豬肉類 is the pork category).
LTwo = {"水菜": 1199, "特菜": 1200,
        "進口果": 1201, "干果": 1202,
        "豬肉類": 1205, "牛肉類": 1206, "羊肉類": 1207, "禽蛋類": 1208,
        "淡水魚": 1209, "海水魚": 1210, "蝦蟹類": 1217, "貝殼類": 1218, "其他類": 1211,
        "米面類": 1212, "雜糧類": 1213, "食用油": 1214}

url = "http://www./getPriceData.html"
data = {
    "limit": 200,                      # records per page
    "current": 1,                      # page number, starting at 1
    "pubDateStartTime": "2022/12/01",
    "pubDateEndTime": "2022/12/31",
    "prodPcatid": LOne["肉禽蛋"],       # first-level category: meat, poultry, eggs
    "prodCatid": LTwo["豬肉類"],        # second-level category: pork
    "prodName": "",
}

with open("pork_prices.csv", mode="w", newline="", encoding="utf-8") as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow(["Level-1 category", "Level-2 category", "Product",
                         "Lowest price", "Average price", "Highest price",
                         "Specification", "Origin", "Unit", "Publication date"])

    # The first request is only used to read the total record count.
    response = requests.post(url, data=data, timeout=10)
    response.raise_for_status()
    json_data = response.json()
    count = json_data['count']
    limit = json_data['limit']
    # Ceiling division avoids an extra empty page when count is a multiple of limit.
    n = math.ceil(count / limit)

    for i in range(1, n + 1):
        time.sleep(1)  # pause between requests to avoid hammering the server
        data['current'] = i
        response = requests.post(url, data=data, timeout=10)
        response.raise_for_status()
        for e in response.json()['list']:
            row = [
                e['prodPcat'],               # level-1 category (parent, matches prodPcatid)
                e['prodCat'],                # level-2 category (matches prodCatid)
                e['prodName'],               # product name
                e['lowPrice'],               # lowest price
                e['avgPrice'],               # average price
                e['highPrice'],              # highest price
                e['specInfo'],               # specification
                e['place'],                  # place of origin
                e['unitInfo'],               # unit
                e['pubDate'].split(' ')[0],  # publication date without the time part
            ]
            print(row)
            csv_writer.writerow(row)
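The pagination logic can be separated from the network call, which makes it testable offline. The sketch below is one possible arrangement, not the script's actual structure: `iter_records` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real POST request by returning dictionaries with the same `count`, `limit`, and `list` fields.

```python
import math
from typing import Callable, Iterator


def iter_records(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every record across all pages, reusing the first response."""
    first = fetch_page(1)
    pages = math.ceil(first["count"] / first["limit"])
    yield from first["list"]
    for current in range(2, pages + 1):
        yield from fetch_page(current)["list"]


def fake_fetch(current: int) -> dict:
    """Offline stand-in for the POST request: 5 records, 2 per page."""
    items = [{"id": i} for i in range(5)]
    limit = 2
    start = (current - 1) * limit
    return {"count": len(items), "limit": limit,
            "list": items[start:start + limit]}


print([r["id"] for r in iter_records(fake_fetch)])  # [0, 1, 2, 3, 4]
```

In real use, `fetch_page` would wrap the `requests.post` call and the one-second pause; reusing the first response saves one request per run.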
4. Results