Use PHP to Get the Links from an HTML Page
Gets the links from an HTML page using PHP's DOMDocument class. In the example, $links is a DOMNodeList object rather than an array, so the href of each link is copied into an array for easy access. $linksArray is initialized as an empty array, and each href value is appended to it in a foreach loop.
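$linksArray = array();
$page = new DOMDocument;
// Set parser options before loading; they do not change a document that is already loaded.
$page->preserveWhiteSpace = false;
$page->loadHTML(file_get_contents("http://lage.us/PHP-Get-Links-From-Page.html"));
// getElementsByTagName() returns a DOMNodeList of every <a> element.
$links = $page->getElementsByTagName('a');
if ($links->length > 0) {
    foreach ($links as $link) {
        $linksArray[] = $link->getAttribute('href');
    }
}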
Example:
PHP
$linksArray = array();
$page = new DOMDocument;
$page->preserveWhiteSpace = false; // remove redundant white space; set before loading
$page->loadHTML(file_get_contents("http://lage.us/PHP-Get-Links-From-Page.html"));
$links = $page->getElementsByTagName('a');
if ($links->length > 0) {
    foreach ($links as $link) {
        $linksArray[] = $link->getAttribute('href');
    }
}
print "<pre>"; print_r($linksArray); print "</pre>";
Produces the result:
Array
(
[0] => /
[1] => php.html
[2] => html.html
[3] => javascript.html
[4] => css.html
[5] => PHP-load-CSV-into-2d-array.html
[6] => PHP-Convert-2d-Array-to-CSV.html
[7] => PHP-CSV-to-Array.html
[8] => PHP-Insert-Element-Into-Array.html
[9] => PHP-Remove-Last-Character-From-String.html
[10] => PHP-Round-2d-Array-By-Key.html
[11] => PHP-String-Contains-Substring.html
[12] => PHP-Get-Contents-of-Directory.html
[13] => PHP-Script-Time-to-Execute.html
[14] => PHP-Loop-for-Period-of-Time.html
[15] => PHP-Looping-Structures.html
[16] => PHP-Get-Links-From-Page.html
[17] => http://www.indoorclimbing.com/
[18] => http://www.ziplinerider.com/
[19] => http://antiqueable.com/
[20] => http://escaperoomplayer.com/
[21] => http://trampoline.jumpcenters.com/
[22] => http://inflatable.jumpcenters.com/
[23] => disclaimer.html
[24] => privacy-policy.html
[25] => terms-of-use.html
)
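The same steps can be wrapped in a reusable function. The version below is only a sketch under a few assumptions: the function name extractLinks and the inline sample markup are hypothetical and not part of the original example, and it takes an HTML string instead of fetching a URL so it can be run without network access. It also uses libxml_use_internal_errors() to keep the warnings that loadHTML() raises on malformed markup from being printed, and it skips anchors that have no href attribute.

```php
<?php
// Hypothetical helper (not from the original example): collect the href
// of every <a> element found in an HTML string.
function extractLinks(string $html): array
{
    $links = [];
    if (trim($html) === '') {
        return $links; // loadHTML() rejects an empty string
    }

    // Real-world HTML is often malformed; collect libxml warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);

    $page = new DOMDocument();
    $page->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($page->getElementsByTagName('a') as $anchor) {
        // Skip anchors without an href, such as <a name="top">.
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }

    return $links;
}

// Inline sample markup, assumed for illustration only.
$sample = '<p><a href="/">Home</a> <a name="top">Top</a> <a href="php.html">PHP</a></p>';
print_r(extractLinks($sample));
```

To use it on a live page, pass the result of file_get_contents() for the page's URL as the argument. The returned hrefs are exactly as written in the page, so relative links such as php.html stay relative.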