PHP link scraping: converting relative links to absolute links

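This function is extracted from Snoopy and is handy when scraping links. It takes a link and the URL of the page the link was found on. A link that is already absolute (http://) comes back as an absolute URL, and a relative link is combined with the page's base URL to produce one. A non-default port in the base URL is kept for path-relative links (see the third test below); root-relative links such as /asd.html are rebuilt from the scheme and host only.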

<?php
/*===================================================================*
	Function:	_expandlinks
	Purpose:	expand each link into a fully qualified URL
	Input:		$links			the links to qualify
				$URI			the full URI to get the base from
	Output:		$expandedLinks	the expanded links
*===================================================================*/
function _expandlinks($links,$URI)
{
	$URI_PARTS = parse_url($URI);
	$host = $URI_PARTS["host"];
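	// Base URI without its query string
	preg_match("/^[^?]+/",$URI,$match);
	// Strip a trailing file name (e.g. /index.html), then a trailing slash
	$match = preg_replace("|/[^/.]+\.[^/.]+$|","",$match[0]);
	$match = preg_replace("|/$|","",$match);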
	$match_part = parse_url($match);
	$match_root =
	$match_part["scheme"]."://".$match_part["host"];
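	// Patterns are applied in order, each one to the result of the previous one
	$search = array( 	"|^http://".preg_quote($host)."|i",	// same-host absolute link: strip scheme and host
						"|^(/)|i",							// root-relative link
						"|^(?!http://)(?!mailto:)|i",		// anything else not starting with http:// or mailto:
						"|/\./|",							// collapse "/./"
						"|/[^/]+/\.\./|"					// collapse "/dir/../"
					);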
	$replace = array(	"",
						$match_root."/",
						$match."/",
						"/",
						"/"
					);
	$expandedLinks = preg_replace($search,$replace,$links);
	return $expandedLinks;
}
// Test cases
$r = _expandlinks('asd/asd.html','http://www.361way.com/');
echo $r;
//output http://www.361way.com/asd/asd.html
echo '<br />';
$r = _expandlinks('http://www.361way.com/asd.html','http://www.361way.com/');
echo $r;
//output http://www.361way.com/asd.html
echo '<br />';
$r = _expandlinks('asd.html','http://www.361way.com:8080/');
echo $r;
//output http://www.361way.com:8080/asd.html
?>
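
For comparison, here is a minimal sketch of the same idea built on parse_url instead of chained regex replacements. It is not Snoopy's code: resolve_link is a hypothetical helper, it assumes the base is an absolute URL with a scheme and host, and its dot-segment handling is a simplification rather than a full RFC 3986 resolver. Unlike the function above, it carries the port into root-relative links and leaves any link that already has a scheme (https:, mailto: and so on) untouched.

```php
<?php
/**
 * Hypothetical helper (not part of Snoopy): resolve $link against $base.
 * Assumes $base is an absolute URL with scheme and host. Links that start
 * with "?" or "#" are not handled specially in this sketch.
 */
function resolve_link(string $link, string $base): string
{
    // A link that already carries a scheme is returned unchanged.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $link)) {
        return $link;
    }

    $parts = parse_url($base);

    // Protocol-relative link (//host/path): reuse only the base scheme.
    if (strpos($link, '//') === 0) {
        return $parts['scheme'] . ':' . $link;
    }

    // Site root: scheme, host and, if present, the port.
    $root = $parts['scheme'] . '://' . $parts['host']
          . (isset($parts['port']) ? ':' . $parts['port'] : '');

    if (strpos($link, '/') === 0) {
        // Root-relative link: the path is used as given.
        $path = $link;
    } else {
        // Path-relative link: prepend the directory of the base document.
        $basePath = $parts['path'] ?? '/';
        $dir  = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $path = $dir . $link;
    }

    // Keep the query string and fragment out of the segment clean-up.
    $cut  = strcspn($path, '?#');
    $tail = substr($path, $cut);
    $path = substr($path, 0, $cut);

    // Drop "." segments and let ".." remove the previous segment.
    $out = [];
    foreach (explode('/', substr($path, 1)) as $segment) {
        if ($segment === '..') {
            array_pop($out);
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }

    return $root . '/' . implode('/', $out) . $tail;
}
```

With the base http://www.361way.com:8080/dir/page.html, the link /asd.html resolves to http://www.361way.com:8080/asd.html under this sketch.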

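The tests show how the parameters are used: the first parameter, $links, is the URL of the link to convert, and the second, $URI, is the URL of the page it was found on.
For example, suppose a scraped page contains the link <a href="asd.html">Test</a>

and the site's base URL is http://www.test.com/. The function resolves the relative path against that base and returns the absolute URL http://www.test.com/asd.html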
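
To get from a scraped page to the $links argument, the href values have to be pulled out of the HTML first. Below is a rough sketch of that step, assuming a simple regular expression is good enough for the pages being scraped. The name extract_hrefs is hypothetical, unquoted href values are ignored, and a DOM parser would likely be more robust on messy markup.

```php
<?php
/**
 * Hypothetical helper: return the href value of every <a> tag in $html.
 * Only quoted attribute values are matched in this sketch.
 */
function extract_hrefs(string $html): array
{
    // <a ... href="value"> or <a ... href='value'>, case-insensitive.
    $pattern = '/<a\b[^>]*?\shref\s*=\s*(["\'])(.*?)\1/is';
    preg_match_all($pattern, $html, $matches);

    // Attribute values may contain entities such as &amp;, so decode them.
    return array_map('html_entity_decode', $matches[2]);
}
```

Each returned value can then be passed to _expandlinks together with the URL of the page it came from.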
