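How do I get Beautiful Soup to find links by href and class?

I'm writing a script to download multiple FLAC files from a website. I'm using Beautiful Soup to get the FLAC links and urlopen to download them.

I want BS to search for links that end in .flac. I don't know the file names, only the extension (for example, one file is XXX.flac and another is YYY.flac).

Here is the HTML for a FLAC file: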

<b><a class=location href="/soundtracks/index.php">Soundtracks</a><font class=location> &raquo </font><a href="/soundtracks/highquality/index.php">High Quality Game 
Soundtracks [FLAC]</a><font class=location> &raquo </font><a href="/soundtracks/highquality/Metal_Gear_20th_Anniversary/72">Metal Gear 20th Anniversary</a><font class=location> &raquo 01 Metal Gear 20 Years History -Past, Present, Future- Download</font></b><h1>Metal Gear 20th Anniversary Download Links:</h1><a style="font-size: 16px; font-weight:bold;" href="http://50.7.161.234/bks/94/245/Music/[029] MG 20th Anniversary [FLAC]/01 Metal Gear 20 Years History -Past, Present, Future-.flac">Metal Gear 20th Anniversary - 01 Metal Gear 20 Years History -Past, Present, Future-</a> <font face="Verdana" style="font-size: 16px;">Format: FLAC, Size: 76M</font><br> <font face="Verdana" style="font-size: 10px;"><b>Note: If the file starts playing in your browser window, try right-clicking and "Save Target As"</b></font><br>

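I tried searching by id with t = soup.find(id="flac"), but it returned nothing relevant. I'm drawing a blank and don't know how else to approach this.

How can I get BS to search for and find the file link, then assign that link to a variable?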

import mechanize
import urllib, urllib2, re
from bs4 import BeautifulSoup
####MECHANIZE####
br = mechanize.Browser()
res = br.open("http://www.emuparadise.me/soundtracks/highquality/Metal_Gear_20th_Anniversary/72")
a = 2 #COUNTER FOR LOOP
br.follow_link(text_regex='Download', nr=a)
b = br.geturl() #GETS THE URL
print b


page = urllib2.urlopen(b).read()
soup = BeautifulSoup(page)
soup.prettify()
t = soup.find(id="")
print t

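Best answer

Your code is trying to match an id attribute, but the anchor tags that link to these files don't have one.

Instead, use a regular expression to match an href that ends in .flac: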

t = soup.find_all(href=re.compile(r"\.flac$"))
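
Note that find_all returns a list of tag objects rather than URL strings, so the href still has to be read from each match before it can be assigned to a variable. The following is a minimal sketch of that step, assuming Python 3 and the built-in "html.parser" backend (the question's own code is Python 2). The helper name find_flac_links and the trimmed SAMPLE_HTML are illustrative and not part of the original answer.

```python
import re
from bs4 import BeautifulSoup

# Trimmed-down copy of the markup from the question (illustrative input).
SAMPLE_HTML = '''
<a class=location href="/soundtracks/index.php">Soundtracks</a>
<a style="font-size: 16px; font-weight:bold;" href="http://50.7.161.234/bks/94/245/Music/[029] MG 20th Anniversary [FLAC]/01 Metal Gear 20 Years History -Past, Present, Future-.flac">Metal Gear 20th Anniversary - 01</a>
'''


def find_flac_links(html):
    """Return the href of every <a> tag whose href ends in .flac."""
    soup = BeautifulSoup(html, "html.parser")
    # Escape the dot so it matches a literal "." and not any character.
    pattern = re.compile(r"\.flac$", re.IGNORECASE)
    return [a["href"] for a in soup.find_all("a", href=pattern)]


flac_links = find_flac_links(SAMPLE_HTML)
# Assign the first match to a variable, or None if nothing matched.
flac_url = flac_links[0] if flac_links else None
print(flac_url)
```

The URL on this page contains spaces and brackets, so it will probably need percent-encoding (for example with urllib.parse.quote) before it is passed to urlopen.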
