In a previous post, I talked about the loss of links on the web. In Brian Suda’s research on his Pinboard links, he found that 22% of the links were gone. I have links on my website dating back to early 2006, so I was curious how many would still be working.
So I ran a similar analysis on the links I have posted and found that almost a third of them are now broken. Some fail because the domain is no longer available, and others because the website returns a “not found” (404) response. It’s not perfect, but as Brian writes:
If the HTTP code is less than 400 we mark it as a success. Without manually checking every URL, there might be some false positives: people selling existing domains, hosting provider redirects, etc. If the status code was 400 or higher, we marked it as a failure.
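Brian's rule boils down to a one-line predicate. The sketch below is only an illustration of that rule: `isWorkingStatus` is a hypothetical helper name that appears in neither Brian's script nor my commands.

```php
<?php

declare(strict_types=1);

// Hypothetical helper illustrating the quoted rule:
// any status code below 400 counts as a working link.
function isWorkingStatus(int $status): bool
{
    return $status < 400;
}

// A few sample codes: 200 (OK) and 301 (redirect) pass, 404 and 500 fail.
foreach ([200, 301, 404, 500] as $status) {
    echo $status, ' => ', isWorkingStatus($status) ? 'working' : 'broken', PHP_EOL;
}
```

The command I ended up with relies on Laravel's `successful()` method instead, which is a little stricter because it only accepts 2xx status codes.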
These are the results I found. Out of the 1546 links in my link collection, 507 returned an error. That means 32.8% of the links are broken, including over 40% of those I added in 2006.
Year | Links | Broken | Percentage |
---|---|---|---|
2006 | 760 | 325 | 42.8% |
2007 | 156 | 49 | 31.4% |
2008 | 50 | 19 | 38% |
2009 | 96 | 26 | 27.1% |
2010 | 159 | 62 | 39% |
2011 | 102 | 25 | 24.5% |
2022 | 199 | 1 | 0.5% |
2023 | 24 | 0 | 0% |
TOTAL | 1546 | 507 | 32.8% |
To investigate this, I built two Laravel Artisan commands: one to find broken links and another to generate the report above.
Finding Broken Links
I followed an approach very similar to Brian’s: making an HTTP request to the URL and checking whether the response is unsuccessful. I can run the command to just check for broken links, or additionally mark them in the database by adding an --update flag:
php artisan link:error-checking
php artisan link:error-checking --update
Below is the command code (sans the namespaces):
class ErrorChecking extends Command
{
    use Conditionable;

    /** @var string */
    protected $signature = 'link:error-checking {--update}';

    /** @var string */
    protected $description = 'Check links for HTTP errors and optionally mark them as broken.';

    // Guzzle options: disable TLS certificate verification.
    protected array $httpOptions = [
        'verify' => false,
    ];

    public function handle(): mixed
    {
        $errorLinks = Collection::make();
        $update = $this->option('update');

        // Check every link, collecting the ones that fail.
        $links = $this->withProgressBar(Link::all(), function (Link $link) use ($errorLinks) {
            $error = $this->isLinkAnError($link);
            $errorLinks->when($error, fn () => $errorLinks->add($link));
        });

        $this->newLine();

        // List each failing URL and, with --update, mark it as broken.
        $errorLinks->each(function (Link $link) use ($update): void {
            $this->error($link->permalink);
            $this->when($update, fn () => $link->markAsBroken());
        });

        $this->info($this->message($links->count(), $errorLinks->count()));

        return Command::SUCCESS;
    }

    protected function isLinkAnError(Link $link): bool
    {
        try {
            // successful() is true only for a 2xx response.
            return ! Http::withOptions($this->httpOptions)->get($link->permalink)->successful();
        } catch (ConnectionException | RequestException) {
            // A failed connection (for example, a lapsed domain) counts as an error.
            return true;
        }
    }

    // Build the summary line, e.g. "2 links checked. There was 1 error."
    protected function message(int $total, int $errorCount): string
    {
        $link = Str::plural('link', $total);
        $error = Str::plural('error', $errorCount);
        $were = $errorCount === 1 ? 'was' : 'were';

        return "{$total} {$link} checked. There {$were} {$errorCount} {$error}.";
    }
}
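The core of the command is a loop that fetches each URL and collects the failures. As a rough, framework-free sketch of that idea, the status lookup can be passed in as a callable, so the loop can be tried without touching the network. The names `findBrokenLinks` and `$fetchStatus`, the example URLs, and the choice to treat any thrown exception as a failure are all assumptions made for illustration; this is not the command's actual code.

```php
<?php

declare(strict_types=1);

/**
 * Return the URLs whose status is not 2xx, or whose lookup threw.
 *
 * @param string[] $urls
 * @param callable(string): int $fetchStatus hypothetical status lookup
 * @return string[]
 */
function findBrokenLinks(array $urls, callable $fetchStatus): array
{
    $broken = [];

    foreach ($urls as $url) {
        try {
            $status = $fetchStatus($url);
            $ok = $status >= 200 && $status < 300;
        } catch (Throwable) {
            // Connection problems (DNS failure, timeout) count as broken.
            $ok = false;
        }

        if (! $ok) {
            $broken[] = $url;
        }
    }

    return $broken;
}

// Stand-in lookup with canned responses instead of real HTTP requests.
$fakeStatus = function (string $url): int {
    $responses = [
        'https://example.com/ok' => 200,
        'https://example.com/missing' => 404,
    ];

    if (! isset($responses[$url])) {
        throw new RuntimeException("Could not connect to {$url}");
    }

    return $responses[$url];
};

$broken = findBrokenLinks(
    ['https://example.com/ok', 'https://example.com/missing', 'https://gone.example/'],
    $fakeStatus
);

print_r($broken);
```

Keeping the lookup separate from the loop also makes it easy to swap in a real HTTP client later, or a fake one in tests.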
Broken Link Report
For the report, I can use my local database, which now holds the updated data. Running link:error-report generates a table:
+-------+-------+--------+------------+
| Year  | Links | Broken | Percentage |
+-------+-------+--------+------------+
| 2006  | 760   | 325    | 42.8%      |
| 2007  | 156   | 49     | 31.4%      |
| 2008  | 50    | 19     | 38%        |
| 2009  | 96    | 26     | 27.1%      |
| 2010  | 159   | 62     | 39%        |
| 2011  | 102   | 25     | 24.5%      |
| 2022  | 199   | 1      | 0.5%       |
| 2023  | 24    | 0      | 0%         |
| Total | 1546  | 507    | 32.8%      |
+-------+-------+--------+------------+
I can change the style of the table using the --style flag, based on the styles provided by Laravel. This can be one of the following: default, borderless, compact, symfony-style-guide, box or box-double. Unfortunately, there isn't currently a Markdown syntax style provided by the underlying Symfony component.
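Without a built-in Markdown style, one workaround is to format the rows by hand. The following is a minimal sketch in plain PHP, not part of my command: `markdownTable` is a hypothetical helper, and it assumes the cell values contain no pipe characters that would need escaping.

```php
<?php

declare(strict_types=1);

/**
 * Render headers and rows as a Markdown table.
 *
 * @param string[] $headers
 * @param array<int, array<int|string, int|string>> $rows
 */
function markdownTable(array $headers, array $rows): string
{
    // Join one row of cells into a pipe-delimited line.
    $line = fn (array $cells): string => '| ' . implode(' | ', $cells) . ' |';

    $lines = [
        $line($headers),
        $line(array_fill(0, count($headers), '---')),
    ];

    foreach ($rows as $row) {
        $lines[] = $line($row);
    }

    return implode(PHP_EOL, $lines);
}

echo markdownTable(
    ['Year', 'Links', 'Broken', 'Percentage'],
    [
        [2022, 199, 1, '0.5%'],
        [2023, 24, 0, '0%'],
    ]
), PHP_EOL;
```

Because `implode` only looks at array values, the same helper would also accept associative rows such as those returned by `toArray()`.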
Below is the code for the report command:
class ErrorReport extends Command
{
    use Conditionable;

    /** @var string */
    protected $signature = 'link:error-report {--style=default}';

    /** @var string */
    protected $description = 'Output a report about broken links.';

    protected array $tableHeaders = [
        'Year',
        'Links',
        'Broken',
        'Percentage',
    ];

    public function handle(): mixed
    {
        // One row per year: the total number of links plus a subquery
        // counting the broken ones for that year.
        $errorLinks = Link::query()
            ->selectRaw('YEAR(added) as `Year`')
            ->selectRaw('count(*) as `Total`')
            ->selectRaw('(SELECT count(*) FROM `links` as l WHERE l.broken = 1 AND YEAR(l.added) = YEAR(links.added)) as `Broken`')
            ->groupBy('Year')
            ->orderBy('Year')
            ->get()
            ->map(function (Link $link): array {
                return array_merge($link->toArray(), [
                    'Percentage' => $this->percentage($link->Broken, $link->Total),
                ]);
            });

        // Append a totals row, then render using the requested style.
        $this->table(
            $this->tableHeaders,
            $errorLinks->add([
                'Total',
                $errorLinks->sum('Total'),
                $errorLinks->sum('Broken'),
                $this->percentage($errorLinks->sum('Broken'), $errorLinks->sum('Total'))
            ]),
            $this->option('style')
        );

        return Command::SUCCESS;
    }

    // Share of broken links as a percentage, rounded to one decimal place.
    protected function percentage(int $broken, int $total): string
    {
        $percentage = round(($broken / $total) * 100, 1);

        return "{$percentage}%";
    }
}
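To make the query's logic easier to follow, here is a rough plain-PHP equivalent of the grouping: count the links and the broken links per year, then work out the percentage. The `$links` sample rows and the `brokenByYear` name are made up for illustration, the dates are assumed to be `YYYY-MM-DD` strings, and the real command does this work in SQL.

```php
<?php

declare(strict_types=1);

/**
 * Group links by year and count how many are broken.
 *
 * @param array<int, array{added: string, broken: bool}> $links
 * @return array<int, array{Year: int, Total: int, Broken: int, Percentage: string}>
 */
function brokenByYear(array $links): array
{
    $years = [];

    foreach ($links as $link) {
        // Assumes the date starts with a four-digit year.
        $year = (int) substr($link['added'], 0, 4);

        $years[$year] ??= ['Year' => $year, 'Total' => 0, 'Broken' => 0];
        $years[$year]['Total']++;
        $years[$year]['Broken'] += $link['broken'] ? 1 : 0;
    }

    // Oldest year first, like the ORDER BY in the query.
    ksort($years);

    return array_map(function (array $row): array {
        $percentage = round(($row['Broken'] / $row['Total']) * 100, 1);

        return $row + ['Percentage' => "{$percentage}%"];
    }, array_values($years));
}

// Hypothetical sample data, not taken from my link collection.
$links = [
    ['added' => '2006-03-01', 'broken' => true],
    ['added' => '2006-07-15', 'broken' => false],
    ['added' => '2006-11-30', 'broken' => true],
    ['added' => '2022-05-09', 'broken' => false],
];

print_r(brokenByYear($links));
```

Every year in the result has at least one link, so the division never hits zero here, which is the same reason the grouped query is safe.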