FeedManager
Central manager class for feed operations and orchestration
- final class FeedManager
The FeedManager class is the main entry point for HuntFeed. It provides methods for registering feeds, checking updates, exporting data, and handling events.
- FeedFetcher
- FeedScheduler
- FeedCollection
- FeedExporter
use Hosseinhunta\Huntfeed\Hub\FeedManager;
// Initialize FeedManager
$manager = new FeedManager();
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $fetcher | FeedFetcher | private | Feed fetcher instance |
| $scheduler | FeedScheduler | private | Feed scheduler instance |
| $collection | FeedCollection | private | Feed collection instance |
| $eventHandlers | array | private | Registered event handlers |
| $config | array | private | Configuration settings |
Constructor
__construct(FeedFetcher $fetcher = null, FeedScheduler $scheduler = null, FeedCollection $collection = null)
Methods
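- registerFeed() - Register a single feed for monitoring
- registerFeeds() - Register multiple feeds at once
- checkUpdates() - Check all feeds for new items
- export() - Export feeds in various formats
- getAllItems() - Get all items from all feeds
- on() - Register an event handler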
public
registerFeed()
registerFeed(string $feedId, string $url, array $options = []): self
Registers a single feed for monitoring. The feed will be periodically checked for updates based on the configured interval.
Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| $feedId | string | Unique identifier for the feed | Required |
| $url | string | URL of the RSS/Atom/JSON feed | Required |
| $options | array | Configuration options for the feed | [] |
Options Array
$options = [
'category' => 'Technology', // Feed category (string or array)
'interval' => 300, // Polling interval in seconds
'keep_history' => true, // Keep history of feed states
'max_items' => 0, // Maximum items to keep (0 = unlimited)
];
Example
// Register a feed with custom options
$manager->registerFeed('hacker_news', 'https://news.ycombinator.com/rss', [
'category' => 'Technology',
'interval' => 300,
'keep_history' => true,
'max_items' => 100
]);
// Register with multiple categories
$manager->registerFeed('tech_news', 'https://example.com/feed.xml', [
'category' => ['Technology', 'News'],
'interval' => 600
]);
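The interval option implies a due-check each time the scheduler runs. A minimal sketch of that check, assuming the scheduler records a last-checked Unix timestamp per feed; isFeedDue() is a hypothetical helper and not part of HuntFeed:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not a HuntFeed API): decides whether a feed should
 * be polled again, given when it was last checked and its interval.
 */
function isFeedDue(?int $lastCheckedAt, int $intervalSeconds, int $now): bool
{
    // A feed that has never been checked is always due.
    if ($lastCheckedAt === null) {
        return true;
    }

    // Due once at least one full interval has elapsed.
    return ($now - $lastCheckedAt) >= $intervalSeconds;
}
```

With 'interval' => 300, a feed checked 120 seconds ago would be skipped and one checked 300 or more seconds ago would be polled.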
Exceptions
| Exception | Condition |
|---|---|
| RuntimeException | If the feed cannot be fetched or parsed |
| InvalidArgumentException | If feed ID is empty or URL is invalid |
public
registerFeeds()
registerFeeds(array $feeds): self
Registers multiple feeds at once. Returns self for method chaining.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $feeds | array | Associative array of feed configurations |
Example
// Register multiple feeds
$feeds = [
'techcrunch' => [
'url' => 'https://techcrunch.com/feed/',
'category' => 'Technology',
'interval' => 600
],
'bbc_news' => [
'url' => 'https://bbc.com/news/world/rss.xml',
'category' => 'News',
'interval' => 900,
'keep_history' => true
],
'hacker_news' => 'https://news.ycombinator.com/rss' // Simple format
];
$manager->registerFeeds($feeds);
public
checkUpdates()
checkUpdates(): array
array
Checks all registered feeds for updates. Returns an array keyed by feed ID; each entry holds the updated feed, a feed containing only the new items, the new item count, and an update flag.
Return Value
[
'feed_id_1' => [
'feed' => Feed, // Updated feed object
'new_items' => Feed, // Feed containing only new items
'new_items_count' => int, // Number of new items
'is_updated' => bool // Whether feed was updated
],
// ...
]
Example
// Check for updates
$updates = $manager->checkUpdates();
// Process results
foreach ($updates as $feedId => $data) {
if ($data['is_updated']) {
echo "Feed '{$feedId}' has {$data['new_items_count']} new items\n";
foreach ($data['new_items']->items() as $item) {
echo "- {$item->title}\n";
}
}
}
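One way to derive the new items for a feed is to compare fingerprints between the previous and the current fetch. The sketch below assumes items are plain arrays and that a fingerprint is a SHA-256 hash of ID plus link; findNewItems() is a hypothetical helper, and HuntFeed's internal comparison may differ:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the items in $current whose fingerprint
 * does not appear in $previous.
 *
 * @param list<array{id: string, link: string, title: string}> $previous
 * @param list<array{id: string, link: string, title: string}> $current
 * @return list<array{id: string, link: string, title: string}>
 */
function findNewItems(array $previous, array $current): array
{
    // Assumed fingerprint: hash of ID and link.
    $fingerprint = static fn (array $item): string
        => hash('sha256', $item['id'] . '|' . $item['link']);

    // Build a set of known fingerprints for constant-time lookups.
    $seen = [];
    foreach ($previous as $item) {
        $seen[$fingerprint($item)] = true;
    }

    $new = [];
    foreach ($current as $item) {
        $key = $fingerprint($item);
        if (!isset($seen[$key])) {
            $seen[$key] = true; // also drops duplicates inside $current
            $new[] = $item;
        }
    }

    return $new;
}
```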
public
export()
export(string $format = 'json', string $feedId = null): string
string
Exports feeds in various formats. Supported formats: json, rss, atom, jsonfeed, csv, html, text.
Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| $format | string | Export format | 'json' |
| $feedId | string\|null | Specific feed ID to export, null for all feeds | null |
Examples
// Export all feeds as JSON (for APIs)
$json = $manager->export('json');
header('Content-Type: application/json');
echo $json;
// Export specific feed as RSS
$rss = $manager->export('rss', 'hacker_news');
header('Content-Type: application/rss+xml');
echo $rss;
// Export as CSV for Excel
$csv = $manager->export('csv');
header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename="feeds.csv"');
echo $csv;
// Export as HTML for web display
$html = $manager->export('html');
echo $html;
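For readers curious what a CSV export involves, here is a minimal sketch built on PHP's fputcsv(). The column names and the array item shape are assumptions for illustration; itemsToCsv() is hypothetical and does not reflect the layout HuntFeed actually produces:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical CSV exporter (column layout is an assumption).
 *
 * @param list<array{title: string, link: string, published: string}> $items
 */
function itemsToCsv(array $items): string
{
    $handle = fopen('php://temp', 'r+');
    if ($handle === false) {
        throw new RuntimeException('Unable to open temporary stream');
    }

    // Separator, enclosure and escape are passed explicitly so the
    // behaviour does not depend on version-specific defaults.
    fputcsv($handle, ['title', 'link', 'published'], ',', '"', '\\');
    foreach ($items as $item) {
        fputcsv($handle, [$item['title'], $item['link'], $item['published']], ',', '"', '\\');
    }

    rewind($handle);
    $csv = (string) stream_get_contents($handle);
    fclose($handle);

    return $csv;
}
```

fputcsv() quotes fields that contain the separator, so a title such as "Hello, world" stays in one column.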
Exceptions
| Exception | Condition |
|---|---|
| RuntimeException | If feed ID is specified but not found |
| RuntimeException | If format is not supported |
public
getAllItems()
getAllItems(): array
FeedItem[]
Gets all items from all registered feeds, sorted by publication date (newest first).
Example
// Get all items from all feeds
$allItems = $manager->getAllItems();
echo "Total items: " . count($allItems) . "\n";
// Display latest items
foreach ($allItems as $item) {
echo "{$item->publishedAt->format('Y-m-d H:i')} - {$item->title}\n";
echo " Category: {$item->category}\n";
echo " Link: {$item->link}\n\n";
}
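The newest-first ordering can be sketched with usort() and the spaceship operator, which compares DateTimeImmutable values directly. Items are plain arrays here and mergeNewestFirst() is a hypothetical helper, not the library's implementation:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: merges item lists from several feeds and sorts
 * them by publication date, newest first.
 *
 * @param list<array{title: string, publishedAt: DateTimeImmutable}> ...$feeds
 * @return list<array{title: string, publishedAt: DateTimeImmutable}>
 */
function mergeNewestFirst(array ...$feeds): array
{
    $all = array_merge(...$feeds);

    // Swapping $b and $a yields descending order.
    usort(
        $all,
        static fn (array $a, array $b): int => $b['publishedAt'] <=> $a['publishedAt']
    );

    return $all;
}
```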
public
on()
on(string $event, Closure $handler): self
Registers an event handler for specific events. Returns self for method chaining.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $event | string | Event name (e.g., 'feed:registered', 'item:new') |
| $handler | Closure | Callback function to handle the event |
Example
// Register event handlers
$manager->on('feed:registered', function($data) {
echo "Feed registered: {$data['feedId']}\n";
echo "URL: {$data['url']}\n";
});
$manager->on('item:new', function($data) {
$item = $data['item'];
$feedId = $data['feedId'];
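// Send notification (sendNotification() is an application-defined helper)
sendNotification("New item in {$feedId}: {$item->title}");
// Save to database (saveItemToDatabase() is an application-defined helper)
saveItemToDatabase($item);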
});
$manager->on('feed:updated', function($data) {
echo "Feed {$data['feedId']} updated with {$data['new_items_count']} new items\n";
});
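The on() pattern above boils down to a map from event names to lists of closures. A minimal sketch under that assumption; TinyEmitter and its emit() method are hypothetical and only illustrate the idea, they are not HuntFeed classes:

```php
<?php
declare(strict_types=1);

/** Hypothetical minimal emitter; HuntFeed's internals may differ. */
final class TinyEmitter
{
    /** @var array<string, list<Closure>> */
    private array $handlers = [];

    public function on(string $event, Closure $handler): self
    {
        $this->handlers[$event][] = $handler;

        return $this; // returning $this enables method chaining
    }

    /** Calls every handler registered for $event and returns how many ran. */
    public function emit(string $event, array $data): int
    {
        $called = 0;
        foreach ($this->handlers[$event] ?? [] as $handler) {
            $handler($data);
            $called++;
        }

        return $called;
    }
}
```

Handlers run in registration order, and emitting an event with no handlers is a no-op.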
Other Methods
- getLatestItems(int $limit = 10): array - Get latest items across all feeds
- getItemsByCategory(string $category): array - Get items by category
- getLatestItemsByCategory(string $category, int $limit = 10): array - Get latest items from a category
- searchItems(string $query): array - Search across all feeds
- removeFeed(string $feedId): bool - Remove a feed from tracking
- getFeedStatus(string $feedId): ?array - Get feed status information
- getAllFeedsStatus(): array - Get status for all feeds
- getStats(): array - Get statistics about feeds and items
- getMetadata(): array - Get complete collection metadata
- setConfig(string $key, mixed $value): self - Set configuration option
- getConfig(string $key = null): mixed - Get configuration value or all config
- getCollection(): FeedCollection - Get the feed collection instance
- getScheduler(): FeedScheduler - Get the feed scheduler instance
- getFetcher(): FeedFetcher - Get the feed fetcher instance
- forceUpdateFeed(string $feedId): bool - Force update a specific feed
- forceUpdateAll(): array - Force update all feeds
WebSubManager
Manages WebSub (PubSubHubbub) subscriptions and push notifications
The WebSubManager class handles real-time push notifications via the WebSub protocol. It automatically discovers hubs, manages subscriptions, and processes incoming notifications.
- WebSubSubscriber
- FeedManager
- FeedFetcher
- WebSubHandler
use Hosseinhunta\Huntfeed\WebSub\WebSubManager;
use Hosseinhunta\Huntfeed\Hub\FeedManager;
// Initialize WebSubManager
$feedManager = new FeedManager();
$webSubManager = new WebSubManager(
$feedManager,
'https://your-domain.com/websub-callback.php' // Callback URL
);
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $subscriber | WebSubSubscriber | private | WebSub subscriber instance |
| $feedManager | FeedManager | private | Feed manager instance |
| $feedHubs | array | private | Mapping of feed IDs to hub URLs |
| $autoSubscribe | bool | private | Whether to auto-subscribe to detected hubs |
| $fallbackToPolling | bool | private | Whether to fall back to polling if no hub is found |
Constructor
__construct(FeedManager $feedManager, string $callbackUrl)
Key Methods
public
registerFeedWithWebSub()
registerFeedWithWebSub(string $feedId, string $feedUrl, array $options = []): array
array
Registers a feed with WebSub support. Automatically discovers the hub and subscribes for push notifications.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $feedId | string | Unique feed identifier |
| $feedUrl | string | Feed URL |
| $options | array | Configuration options |
Return Value
[
'feed_id' => string,
'feed_url' => string,
'has_hub' => bool,
'hub_url' => string|null,
'subscription' => array|null,
'subscription_status' => string,
'success' => bool,
'error' => string|null
]
Example
// Register feed with WebSub
$result = $webSubManager->registerFeedWithWebSub(
'tech_news',
'https://example.com/feed.xml',
[
'category' => 'Technology',
'interval' => 600,
'keep_history' => true
]
);
if ($result['has_hub']) {
echo "WebSub hub detected: {$result['hub_url']}\n";
if ($result['subscription_status'] === 'pending_verification') {
echo "Subscription request sent. Awaiting verification...\n";
}
} else {
echo "No WebSub hub found. Falling back to polling.\n";
}
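Under the WebSub specification, subscribing means POSTing a form-encoded body with hub.mode, hub.topic and hub.callback to the hub. The sketch below only builds that body; buildSubscriptionBody() is a hypothetical helper, and the lease default is an assumption rather than a HuntFeed setting:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds the form body of a WebSub subscription
 * request. Parameter names follow the WebSub specification.
 */
function buildSubscriptionBody(
    string $topicUrl,
    string $callbackUrl,
    ?string $secret = null,
    int $leaseSeconds = 86400 // assumed default, not taken from HuntFeed
): string {
    $fields = [
        'hub.mode' => 'subscribe',
        'hub.topic' => $topicUrl,
        'hub.callback' => $callbackUrl,
        'hub.lease_seconds' => (string) $leaseSeconds,
    ];

    // The secret is optional; when set, the hub signs notifications with it.
    if ($secret !== null) {
        $fields['hub.secret'] = $secret;
    }

    return http_build_query($fields, '', '&');
}
```

The resulting string would be sent with Content-Type: application/x-www-form-urlencoded; the hub then verifies the subscription by calling the callback URL.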
public
handleWebSubNotification()
handleWebSubNotification(string $body, array $headers, ?callable $onNewItems = null): array
array
Processes an incoming WebSub notification. Verifies the HMAC signature and extracts feed items.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $body | string | Raw request body containing feed content |
| $headers | array | Request headers for signature verification |
| $onNewItems | callable\|null | Callback when new items are detected |
Example
// In your callback endpoint (e.g., websub-callback.php)
$body = file_get_contents('php://input');
$headers = getallheaders();
$result = $webSubManager->handleWebSubNotification(
$body,
$headers,
function($items) {
// Process new items
foreach ($items as $item) {
echo "New item: {$item['title']}\n";
saveToDatabase($item);
sendNotification($item);
}
}
);
if ($result['success']) {
// Success - return 204 No Content
http_response_code(204);
} else {
// Error - return appropriate status
http_response_code(400);
echo $result['error'];
}
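The signature check mentioned above can be sketched with hash_hmac() and hash_equals(). WebSub hubs send an X-Hub-Signature header of the form algorithm=hexdigest; verifyHubSignature() is a hypothetical standalone helper, not the method HuntFeed calls internally:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: verifies an X-Hub-Signature header value against
 * the raw request body and the secret shared with the hub.
 */
function verifyHubSignature(string $body, string $signatureHeader, string $secret): bool
{
    // Expected header format: "<algorithm>=<hex digest>", e.g. "sha256=ab12...".
    $parts = explode('=', $signatureHeader, 2);
    if (count($parts) !== 2) {
        return false;
    }

    [$algorithm, $digest] = $parts;
    $algorithm = strtolower($algorithm);

    // Algorithms recognised by the WebSub specification.
    if (!in_array($algorithm, ['sha1', 'sha256', 'sha384', 'sha512'], true)) {
        return false;
    }

    $expected = hash_hmac($algorithm, $body, $secret);

    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($expected, strtolower($digest));
}
```

The body must be the raw bytes read from php://input; re-encoding or trimming it before hashing changes the digest.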
public
getSubscriptionStatus()
getSubscriptionStatus(): array
array
Gets subscription status for all feeds.
Return Value
[
'total_feeds' => int,
'websub_enabled_feeds' => int,
'verified_subscriptions' => int,
'subscriptions' => array,
'auto_subscribe' => bool,
'fallback_polling' => bool
]
Example
// Get subscription status
$status = $webSubManager->getSubscriptionStatus();
echo "Total feeds: {$status['total_feeds']}\n";
echo "WebSub enabled: {$status['websub_enabled_feeds']}\n";
echo "Verified subscriptions: {$status['verified_subscriptions']}\n";
foreach ($status['subscriptions'] as $subscription) {
echo "Feed: {$subscription['feed_url']}\n";
echo "Hub: {$subscription['hub_url']}\n";
echo "Status: " . ($subscription['verified'] ? 'Verified' : 'Pending') . "\n";
echo "Subscribed at: {$subscription['subscribed_at']}\n\n";
}
public
getHandler()
getHandler(): WebSubHandler
WebSubHandler
Gets a WebSubHandler instance for HTTP endpoint handling.
Example
// Create HTTP handler for your endpoint
$handler = $webSubManager->getHandler();
// Set up notification callback
$handler->onNotification(function($notification) {
// Process incoming notifications
foreach ($notification['items'] as $item) {
processNewItem($item);
}
});
// In your endpoint file
$method = $_SERVER['REQUEST_METHOD'];
$query = $_GET;
$body = file_get_contents('php://input');
$headers = getallheaders();
$result = $handler->processRequest($method, $query, $body, $headers);
http_response_code($result['status']);
echo $result['body'];
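Besides notifications, the callback endpoint also receives the hub's verification GET request, which must be answered by echoing hub.challenge. A minimal sketch of that step, assuming the caller knows which topics it asked for; answerVerification() is hypothetical and independent of WebSubHandler:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: answers a WebSub verification-of-intent request.
 *
 * PHP rewrites dots in query parameter names to underscores, so the
 * hub's "hub.challenge" arrives in $_GET as "hub_challenge".
 *
 * @param array<string, string> $query          Typically $_GET
 * @param list<string>          $expectedTopics Topic URLs we subscribed to
 * @return array{status: int, body: string}
 */
function answerVerification(array $query, array $expectedTopics): array
{
    $mode = $query['hub_mode'] ?? '';
    $topic = $query['hub_topic'] ?? '';
    $challenge = $query['hub_challenge'] ?? '';

    $knownMode = in_array($mode, ['subscribe', 'unsubscribe'], true);
    $knownTopic = in_array($topic, $expectedTopics, true);

    if ($knownMode && $knownTopic && $challenge !== '') {
        // Echoing the challenge confirms the request.
        return ['status' => 200, 'body' => $challenge];
    }

    // A 404 tells the hub we do not agree with the action.
    return ['status' => 404, 'body' => ''];
}
```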
Configuration Methods
- setAutoSubscribe(bool $enabled): self - Enable/disable auto-subscription to detected hubs
- setFallbackToPolling(bool $enabled): self - Enable/disable fallback to polling when no hub is found
- getSubscriber(): WebSubSubscriber - Get the underlying WebSubSubscriber instance
Utility Methods
- registerMultipleFeeds(array $feeds): array - Batch register multiple feeds
- checkUpdates(): array - Check for updates (combines WebSub and polling)
- getStatistics(): array - Get WebSub statistics
- getWebSubFeeds(): array - Get list of feeds with WebSub hub
- static generateCallbackUrl(string $domain, string $path = '/websub'): string - Generate callback URL for registration
Best Practices
- Always verify the callback URL is publicly accessible
- Use HTTPS for callback URLs in production
- Implement proper error handling for subscription failures
- Monitor subscription expiration and renew as needed
- Validate HMAC signatures for security
Feed
Represents a single RSS/Atom/JSON feed with items
- final class Feed implements JsonSerializable
The Feed class represents a parsed feed with metadata and items. It provides methods for filtering, searching, and managing feed items.
use Hosseinhunta\Huntfeed\Feed\Feed;
// Create a new feed
$feed = new Feed('https://example.com/feed.xml', 'Example Feed');
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $url | string | public readonly | Feed URL |
| $title | string | public readonly | Feed title |
| $items | FeedItem[] | private | Array of feed items |
| $duplicateMap | array | private | Duplicate detection map |
| $originalContent | ?string | private | Original feed content (XML/JSON) |
Constructor
__construct(string $url, string $title)
Key Methods
public
addItem()
addItem(FeedItem $item): void
Adds an item to the feed, preventing duplicates using fingerprinting.
Example
// Add an item to the feed
$item = new FeedItem(
id: 'item-123',
title: 'Example Article',
link: 'https://example.com/article',
content: 'Article content...',
enclosure: null,
publishedAt: new DateTimeImmutable('2024-01-15 10:30:00'),
category: 'Technology'
);
$feed->addItem($item);
// Add multiple items
$feed->addItems([$item1, $item2, $item3]);
public
items()
items(): array
FeedItem[]
Gets all items in the feed.
Example
// Get all items
$items = $feed->items();
echo "Total items: " . count($items) . "\n";
foreach ($items as $item) {
echo "- {$item->title} ({$item->publishedAt->format('Y-m-d')})\n";
}
public
getItemsSorted()
getItemsSorted(bool $descending = true): array
FeedItem[]
Gets items sorted by published date.
Example
// Get items sorted by date (newest first)
$sortedItems = $feed->getItemsSorted(true);
// Get items sorted by date (oldest first)
$sortedItems = $feed->getItemsSorted(false);
// Display sorted items
foreach ($sortedItems as $item) {
echo "{$item->publishedAt->format('Y-m-d H:i')} - {$item->title}\n";
}
public
findByCategory()
findByCategory(string $category): array
FeedItem[]
Finds items by category.
Example
// Find items by category
$techItems = $feed->findByCategory('Technology');
$newsItems = $feed->findByCategory('News');
echo "Technology items: " . count($techItems) . "\n";
echo "News items: " . count($newsItems) . "\n";
foreach ($techItems as $item) {
echo "- {$item->title}\n";
}
public
searchByTitle()
searchByTitle(string $query): array
FeedItem[]
Searches items by title (case-insensitive).
Example
// Search items by title
$results = $feed->searchByTitle('PHP');
echo "Found " . count($results) . " items containing 'PHP':\n";
foreach ($results as $item) {
echo "- {$item->title}\n";
}
// Search with partial matches
$results = $feed->searchByTitle('tutorial'); // Matches 'Tutorial', 'tutorial', etc.
public
getLatest()
getLatest(int $limit = 10): array
FeedItem[]
Gets the latest items from the feed.
Example
// Get latest 10 items
$latestItems = $feed->getLatest(10);
// Get latest 5 items
$latestItems = $feed->getLatest(5);
echo "Latest items:\n";
foreach ($latestItems as $item) {
echo "{$item->publishedAt->format('H:i')} - {$item->title}\n";
}
public
paginate()
paginate(int $page = 1, int $perPage = 10): array
FeedItem[]
Gets paginated items from the feed.
Example
// Get first page (items 1-10)
$page1 = $feed->paginate(1, 10);
// Get second page (items 11-20)
$page2 = $feed->paginate(2, 10);
// Get page 3 with 20 items per page
$page3 = $feed->paginate(3, 20);
echo "Page 1 items:\n";
foreach ($page1 as $item) {
echo "- {$item->title}\n";
}
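Page-based slicing of this kind is usually a thin wrapper around array_slice(). A minimal sketch, assuming 1-based page numbers as in the example above; paginateItems() is a hypothetical helper, not Feed::paginate() itself:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns one page of items using 1-based page numbers.
 *
 * @param list<mixed> $items
 * @return list<mixed>
 */
function paginateItems(array $items, int $page = 1, int $perPage = 10): array
{
    // Clamp invalid input instead of failing (a design choice of this sketch).
    $page = max(1, $page);
    $perPage = max(1, $perPage);

    // Page 1 starts at offset 0, page 2 at offset $perPage, and so on.
    return array_slice($items, ($page - 1) * $perPage, $perPage);
}
```

Requesting a page past the end simply yields an empty array.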
public
findAfterDate()
findAfterDate(\DateTimeImmutable $date): array
FeedItem[]
Finds items published after a specific date.
Example
// Find items after January 1, 2024
$date = new DateTimeImmutable('2024-01-01');
$recentItems = $feed->findAfterDate($date);
// Find items after 24 hours ago
$yesterday = new DateTimeImmutable('-24 hours');
$recentItems = $feed->findAfterDate($yesterday);
echo "Items published in the last 24 hours:\n";
foreach ($recentItems as $item) {
echo "- {$item->title}\n";
}
public
findBeforeDate()
findBeforeDate(\DateTimeImmutable $date): array
FeedItem[]
Finds items published before a specific date.
Example
// Find items before January 1, 2024
$date = new DateTimeImmutable('2024-01-01');
$oldItems = $feed->findBeforeDate($date);
// Find items older than 30 days
$cutoff = new DateTimeImmutable('-30 days');
$oldItems = $feed->findBeforeDate($cutoff);
echo "Items older than 30 days:\n";
foreach ($oldItems as $item) {
echo "- {$item->title} ({$item->publishedAt->format('Y-m-d')})\n";
}
public
setOriginalContent()
setOriginalContent(string $content): self
Sets the original feed content (XML/JSON). Useful for WebSub hub detection.
Example
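// Store original feed content
$xmlContent = file_get_contents('https://example.com/feed.xml');
if ($xmlContent === false) {
    // file_get_contents() returns false on failure; setOriginalContent() expects a string
    throw new RuntimeException('Unable to download the feed');
}
$feed->setOriginalContent($xmlContent);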
// Later retrieve it
$original = $feed->getOriginalContent();
// Detect WebSub hub
$hubUrl = WebSubSubscriber::detectHubFromFeed($original);
if ($hubUrl) {
echo "WebSub hub found: {$hubUrl}\n";
}
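Hub detection comes down to finding a link element with rel="hub" in the feed XML. The sketch below shows one way to do that with DOMXPath; findHubUrl() is a hypothetical standalone function and makes no claim about how WebSubSubscriber::detectHubFromFeed() is implemented:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the href of the first rel="hub" link
 * in an RSS or Atom document, or null when none is found.
 */
function findHubUrl(string $feedXml): ?string
{
    if ($feedXml === '') {
        return null;
    }

    $doc = new DOMDocument();

    // Collect libxml errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadXML($feedXml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($doc);

    // local-name() matches <link> and <atom:link> regardless of prefix.
    foreach ($xpath->query("//*[local-name()='link'][@rel and @href]") as $link) {
        // rel may hold several space-separated tokens.
        $rels = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel')))) ?: [];
        if (in_array('hub', $rels, true)) {
            return $link->getAttribute('href');
        }
    }

    return null;
}
```

Hubs can also be advertised through HTTP Link headers, which this sketch does not inspect.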
public
toArray()
toArray(): array
array
Converts the feed to an array representation.
Example
// Convert feed to array
$feedArray = $feed->toArray();
echo "Feed URL: {$feedArray['url']}\n";
echo "Feed title: {$feedArray['title']}\n";
echo "Total items: {$feedArray['items_count']}\n";
// Convert to JSON
$json = json_encode($feed->toArray(), JSON_PRETTY_PRINT);
file_put_contents('feed.json', $json);
public
getMetadata()
getMetadata(): array
array
Gets metadata about the feed.
Return Value
[
'url' => string,
'title' => string,
'total_items' => int,
'first_item_date' => string|null,
'latest_item_date' => string|null,
'categories' => array,
'items_with_enclosures' => int,
'duplicate_stats' => array
]
Example
// Get feed metadata
$metadata = $feed->getMetadata();
echo "Feed: {$metadata['title']}\n";
echo "URL: {$metadata['url']}\n";
echo "Total items: {$metadata['total_items']}\n";
echo "First item: {$metadata['first_item_date']}\n";
echo "Latest item: {$metadata['latest_item_date']}\n";
echo "Categories: " . implode(', ', $metadata['categories']) . "\n";
echo "Items with enclosures: {$metadata['items_with_enclosures']}\n";
public
getDuplicateStats()
getDuplicateStats(): array
array
Gets duplicate detection statistics.
Example
// Get duplicate statistics
$stats = $feed->getDuplicateStats();
echo "Total fingerprints: {$stats['total_fingerprints']}\n";
echo "Unique items: {$stats['unique_items']}\n";
echo "Duplicate map size: {$stats['duplicate_map_size']}\n";
// Calculate duplicates prevented
$duplicatesPrevented = $stats['duplicate_map_size'] - $stats['unique_items'];
echo "Duplicates prevented: {$duplicatesPrevented}\n";
Other Methods
- itemsCount(): int - Get total number of items
- getItemByFingerprint(string $fingerprint): ?FeedItem - Get item by fingerprint
- hasItem(string $fingerprint): bool - Check if item exists by fingerprint
- removeItem(string $fingerprint): bool - Remove an item by fingerprint
- clear(): void - Remove all items
- getOriginalContent(): ?string - Get original feed content
- jsonSerialize(): mixed - JSON serialization (implements JsonSerializable)
FeedItem
Represents a single item/article within a feed
- final class FeedItem implements JsonSerializable
The FeedItem class represents a single item (article, post, entry) within a feed. It provides methods for comparison, fingerprinting, and data access.
use Hosseinhunta\Huntfeed\Feed\FeedItem;
// Create a new feed item
$item = new FeedItem(
id: 'article-123',
title: 'Example Article',
link: 'https://example.com/article',
content: 'Article content goes here...',
enclosure: 'https://example.com/image.jpg',
publishedAt: new DateTimeImmutable('2024-01-15 10:30:00'),
category: 'Technology',
extra: ['author' => 'John Doe', 'comments' => 5]
);
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $id | string | public readonly | Unique identifier for this item |
| $title | string | public readonly | Item title |
| $link | string | public readonly | Item URL |
| $content | ?string | public readonly | Item content/description |
| $enclosure | ?string | public readonly | Enclosure/media URL |
| $publishedAt | DateTimeImmutable | public readonly | Publication date |
| $category | ?string | public readonly | Item category/tag |
| $extra | ?array | public readonly | Additional metadata from feed sources |
Constructor
__construct(
string $id,
string $title,
string $link,
?string $content,
?string $enclosure,
\DateTimeImmutable $publishedAt,
?string $category,
?array $extra = []
)
Throws InvalidArgumentException if neither $id nor $link is provided.
Key Methods
public
fingerprint()
fingerprint(string $strategy = 'default'): string
string
Generates a unique fingerprint for this item. Used for duplicate detection.
Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| $strategy | string | Hashing strategy | 'default' |
Fingerprint Strategies
- default: Uses item ID + link (fastest)
- content: Uses title + content + date (most accurate for duplicates across sources)
- fuzzy: Uses title + date (good for grouping similar items)
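The three strategies differ only in which fields are hashed. A minimal sketch, assuming SHA-256, a pipe separator, lowercase/trim normalization and day-level dates for the fuzzy variant; all of those details and the sketchFingerprint() function are assumptions, not FeedItem's actual algorithm:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: hashes different item fields depending on strategy.
 *
 * @param array{id: string, link: string, title: string, content?: ?string, publishedAt: DateTimeImmutable} $item
 */
function sketchFingerprint(array $item, string $strategy = 'default'): string
{
    // Assumed normalization so that case and padding do not affect matches.
    $normalize = static fn (string $s): string => mb_strtolower(trim($s), 'UTF-8');

    $material = match ($strategy) {
        // title + content + exact timestamp
        'content' => $normalize($item['title']) . '|'
            . $normalize($item['content'] ?? '') . '|'
            . $item['publishedAt']->format(DATE_ATOM),
        // title + calendar day only, so near-identical items group together
        'fuzzy' => $normalize($item['title']) . '|'
            . $item['publishedAt']->format('Y-m-d'),
        // ID + link
        default => $item['id'] . '|' . $item['link'],
    };

    return hash('sha256', $material);
}
```

Under these assumptions, the same article syndicated by two sources gets different default fingerprints but the same content fingerprint.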
Example
// Get default fingerprint
$fingerprint = $item->fingerprint(); // SHA256 hash
// Get content-based fingerprint
$contentFingerprint = $item->fingerprint('content');
// Get fuzzy fingerprint
$fuzzyFingerprint = $item->fingerprint('fuzzy');
echo "Default fingerprint: {$fingerprint}\n";
echo "Content fingerprint: {$contentFingerprint}\n";
echo "Fuzzy fingerprint: {$fuzzyFingerprint}\n";
// Use for duplicate detection
$items = [];
if (!isset($items[$fingerprint])) {
$items[$fingerprint] = $item;
} else {
echo "Duplicate item detected!\n";
}
public
getExtra()
getExtra(string $key, mixed $default = null): mixed
Gets extra field value using dot notation for nested arrays.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $key | string | Field key (supports dot notation for nested fields) |
| $default | mixed | Default value if key doesn't exist |
Example
// Get nested extra field
$authorName = $item->getExtra('author.name');
$thumbnail = $item->getExtra('media.thumbnail.0.url');
// With default value
$rating = $item->getExtra('rating', 0);
$comments = $item->getExtra('comments.count', 0);
// Check if field exists
if ($item->hasExtra('custom.field')) {
$value = $item->getExtra('custom.field');
echo "Custom field value: {$value}\n";
}
// Get all extra fields
$allExtra = $item->getExtraFields();
print_r($allExtra);
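Dot-notation lookup is a walk through nested arrays, one path segment at a time. A minimal sketch of the technique; getByDotPath() is a hypothetical standalone function and may not match how getExtra() treats edge cases such as stored null values:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: resolves a dot-separated path in a nested array.
 */
function getByDotPath(array $data, string $path, mixed $default = null): mixed
{
    $current = $data;

    foreach (explode('.', $path) as $segment) {
        // Stop as soon as the path leaves the array structure.
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return $default;
        }
        $current = $current[$segment];
    }

    return $current;
}
```

Numeric segments such as the 0 in media.thumbnail.0.url address list indexes, because PHP treats the string "0" and the integer 0 as the same array key.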
public
equals()
equals(FeedItem $other): bool
bool
Compares two items for equality using default fingerprinting.
Example
// Compare two items
if ($item1->equals($item2)) {
echo "Items are identical\n";
} else {
echo "Items are different\n";
}
// Check if items are similar (content-based)
if ($item1->isSimilar($item2)) {
echo "Items are similar (may be same article from different sources)\n";
}
// Compare with specific strategy
$fingerprint1 = $item1->fingerprint('content');
$fingerprint2 = $item2->fingerprint('content');
if ($fingerprint1 === $fingerprint2) {
echo "Items have identical content\n";
}
public
withCategory()
withCategory(string $category): self
FeedItem
Creates a new instance with updated category. Returns a new FeedItem instance.
Example
// Create item with one category
$item = new FeedItem(...);
// Create new item with different category
$categorizedItem = $item->withCategory('Technology');
echo "Original category: {$item->category}\n";
echo "New category: {$categorizedItem->category}\n";
// The original item is unchanged
assert($item->category !== $categorizedItem->category);
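The "with" pattern pairs naturally with readonly properties: instead of mutating, the method builds a copy with one field changed. A minimal sketch using a hypothetical two-field TinyItem class; FeedItem itself has more fields and may construct the copy differently:

```php
<?php
declare(strict_types=1);

/** Hypothetical immutable value object illustrating withCategory(). */
final class TinyItem
{
    public function __construct(
        public readonly string $title,
        public readonly ?string $category = null,
    ) {
    }

    public function withCategory(string $category): self
    {
        // Readonly properties cannot be reassigned, so build a new instance.
        return new self($this->title, $category);
    }
}

$original = new TinyItem('Example Article', 'News');
$updated = $original->withCategory('Technology');
```

Because the original is never modified, it is safe to share one item between several collections that each apply their own category.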
public
toArray()
toArray(): array
array
Converts item to array for API responses.
Example
// Convert item to array
$itemArray = $item->toArray();
echo "Item ID: {$itemArray['id']}\n";
echo "Title: {$itemArray['title']}\n";
echo "Published: {$itemArray['publishedAt']}\n";
echo "Fingerprint: {$itemArray['fingerprint']}\n";
// Convert to JSON
$json = json_encode($item->toArray(), JSON_PRETTY_PRINT);
echo $json;
Array Structure
[
'id' => string,
'title' => string,
'link' => string,
'content' => string|null,
'enclosure' => string|null,
'publishedAt' => string, // ISO 8601 format
'category' => string|null,
'extra' => array,
'fingerprint' => string
]
public
getMetadata()
getMetadata(): array
array
Gets basic metadata about this item.
Return Value
[
'id' => string,
'title' => string,
'url' => string,
'category' => string|null,
'published_date' => string, // ISO 8601 format
'has_content' => bool,
'has_enclosure' => bool,
'extra_fields_count' => int,
'fingerprint_default' => string,
'fingerprint_content' => string,
'fingerprint_fuzzy' => string
]
public
isValid()
isValid(): bool
bool
Checks if this item is valid (has required fields).
Example
// Check if item is valid
if ($item->isValid()) {
echo "Item is valid\n";
} else {
echo "Item is missing required fields\n";
}
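// Invalid item example. The constructor is documented to throw
// InvalidArgumentException when both ID and link are missing, so guard the call.
try {
    $invalidItem = new FeedItem(
        id: '', // Empty ID
        title: 'Test',
        link: '', // Empty link
        content: null,
        enclosure: null,
        publishedAt: new DateTimeImmutable(),
        category: null
    );
    if (!$invalidItem->isValid()) {
        echo "Invalid item: missing ID or link\n";
    }
} catch (InvalidArgumentException $e) {
    echo "Invalid item rejected: {$e->getMessage()}\n";
}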
Other Methods
- hasExtra(string $key): bool - Check if a specific extra field exists
- getExtraFields(): array - Get all extra fields
- isSimilar(FeedItem $other): bool - Check if this item is similar to another (content-wise)
- jsonSerialize(): mixed - JSON serialization (implements JsonSerializable)
- __toString(): string - String representation of the item
FeedCollection
Manages a collection of feeds with categorization and search
- final class FeedCollection
The FeedCollection class manages multiple feeds, provides categorization, search, and statistics.
use Hosseinhunta\Huntfeed\Hub\FeedCollection;
// Create a feed collection
$collection = new FeedCollection('Uncategorized');
// Or with custom default category
$collection = new FeedCollection('General');
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $feeds | array<string, Feed> | private | Registered feeds keyed by feed ID |
| $categories | array<string, array<string>> | private | Categories mapping: category_name => [feed_ids] |
| $feedCategories | array<string, string> | private | Feed categories mapping: feed_id => category |
| $defaultCategory | string | private | Default category name |
Constructor
__construct(string $defaultCategory = 'Uncategorized')
Key Methods
public
addFeed()
addFeed(string $feedId, Feed $feed, string|array|null $categories = null): self
Adds a feed to the collection with optional categorization.
Example
// Add feed with single category
$collection->addFeed('tech_news', $feed, 'Technology');
// Add feed with multiple categories
$collection->addFeed('mixed_news', $feed, ['Technology', 'Business', 'News']);
// Add feed with default category
$collection->addFeed('general_feed', $feed); // Uses 'Uncategorized'
// Method chaining
$collection
->addFeed('feed1', $feed1, 'Tech')
->addFeed('feed2', $feed2, ['News', 'Politics'])
->addFeed('feed3', $feed3);
public
getAllItems()
getAllItems(): array
FeedItem[]
Gets all items from all feeds, with each item's category set to the category of its feed.
Example
// Get all items from all feeds
$allItems = $collection->getAllItems();
echo "Total items in collection: " . count($allItems) . "\n";
// Items are automatically categorized with feed's category
foreach ($allItems as $item) {
echo "{$item->title} [{$item->category}]\n";
}
public
getItemsByCategory()
getItemsByCategory(string $category): array
FeedItem[]
Gets items from a specific category (partial match supported).
Parameters
| Parameter | Type | Description |
|---|---|---|
| $category | string | Category name (supports partial matching) |
Example
// Get items by exact category match
$techItems = $collection->getItemsByCategory('Technology');
// Get items by partial match (e.g., "سمنان" (Semnan) matches "اخبار استان ها > سمنان" (Provincial News > Semnan))
$semnanItems = $collection->getItemsByCategory('سمنان');
// Display results
echo "Technology items: " . count($techItems) . "\n";
foreach ($techItems as $item) {
echo "- {$item->title}\n";
}
// Partial matching example
$category = 'اخبار استان ها > سمنان';
$collection->addFeed('semnan_news', $feed, $category);
// Both searches will match
$items1 = $collection->getItemsByCategory('سمنان'); // Partial match
$items2 = $collection->getItemsByCategory('اخبار استان ها > سمنان'); // Exact match
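Partial category matching can be sketched as a multibyte-safe substring test. The sketch assumes matching is case-insensitive, which the text above does not state; categoryMatches() is a hypothetical helper, not the collection's own matcher:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true when $query occurs anywhere inside
 * $feedCategory. mb_stripos() keeps UTF-8 text such as Persian intact.
 */
function categoryMatches(string $feedCategory, string $query): bool
{
    // An empty query would match everything, so treat it as no match.
    if ($query === '') {
        return false;
    }

    return mb_stripos($feedCategory, $query, 0, 'UTF-8') !== false;
}
```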
public
searchItems()
searchItems(string $query): array
FeedItem[]
Searches items across all feeds in title, content, category, and link fields.
Example
// Search across all feeds ('آتش' is Persian for "fire")
$results = $collection->searchItems('آتش');
echo "Found " . count($results) . " items matching 'آتش':\n";
foreach ($results as $item) {
$source = "Unknown";
foreach ($collection->getAllFeeds() as $feedId => $feed) {
if (in_array($item, $feed->items(), true)) {
$source = $feedId;
break;
}
}
echo "- {$item->title} [{$source}]\n";
}
// Case-insensitive search
$results = $collection->searchItems('php'); // Matches 'PHP', 'php', 'Php'
// Search in multiple languages
$results = $collection->searchItems('ایران'); // Persian search
$results = $collection->searchItems('Iran'); // English search
public
getLatestItems()
getLatestItems(int $limit = 10): array
FeedItem[]
Gets the latest items from all feeds (sorted by publication date).
Example
// Get latest 10 items
$latest = $collection->getLatestItems(10);
// Get latest 5 items
$latest = $collection->getLatestItems(5);
echo "Latest items:\n";
foreach ($latest as $item) {
// time_ago() and getFeedSource() are application-defined helpers, not part of HuntFeed
$timeAgo = time_ago($item->publishedAt);
echo "- {$item->title} ({$timeAgo})\n";
echo " Category: {$item->category}\n";
echo " Source: " . getFeedSource($item) . "\n\n";
}
// Get latest items by category
$latestTech = $collection->getLatestItemsByCategory('Technology', 5);
$latestNews = $collection->getLatestItemsByCategory('News', 5);
public
getStats()
getStats(): array
array
Gets collection statistics.
Return Value
[
'total_feeds' => int,
'total_categories' => int,
'total_items' => int,
'categories' => [
'category_name' => [
'feeds_count' => int,
'items_count' => int
],
// ...
],
'categories_list' => array,
'feeds_list' => array
]
Example
// Get collection statistics
$stats = $collection->getStats();
echo "Collection Statistics:\n";
echo "Total feeds: {$stats['total_feeds']}\n";
echo "Total categories: {$stats['total_categories']}\n";
echo "Total items: {$stats['total_items']}\n\n";
echo "Categories:\n";
foreach ($stats['categories'] as $category => $catStats) {
echo "- {$category}: {$catStats['feeds_count']} feeds, {$catStats['items_count']} items\n";
}
echo "\nAll categories: " . implode(', ', $stats['categories_list']) . "\n";
echo "All feeds: " . implode(', ', $stats['feeds_list']) . "\n";
public
getMetadata()
getMetadata(): array
array
Gets collection metadata including dates and enclosures.
Example
// Get collection metadata
$metadata = $collection->getMetadata();
echo "Metadata:\n";
echo "Earliest item: {$metadata['earliest_item']}\n";
echo "Latest item: {$metadata['latest_item']}\n";
echo "Feeds with enclosures: {$metadata['feeds_with_enclosures']}\n\n";
// Access stats through metadata
echo "Stats from metadata:\n";
print_r($metadata['stats']);
public
toArray()
toArray(): array
array
Exports collection to array.
Example
// Export collection to array
$array = $collection->toArray();
// Export to JSON
$json = $collection->toJSON(true); // Pretty print
file_put_contents('collection.json', $json);
// Export to JSON without pretty print (for APIs)
$json = $collection->toJSON(false);
header('Content-Type: application/json');
echo $json;
Other Methods
- removeFeed(string $feedId): bool - Remove a feed from the collection
- getFeed(string $feedId): ?Feed - Get a feed by ID
- getAllFeeds(): array - Get all feeds
- getFeedsByCategory(string $category): array - Get feeds by category
- getCategories(): array - Get all categories
- hasFeed(string $feedId): bool - Check if feed exists
- hasCategory(string $category): bool - Check if category exists
- getItemsByFeeds(array $feedIds): array - Get items from specific feeds
- getItemsCount(): int - Get total items count
- getItemsCountByCategory(string $category): int - Get items count by category
- clear(): void - Clear all feeds
- count(): int - Get feeds count
FeedFetcher
Fetches and parses feeds from URLs with SSL support
- final class FeedFetcher
The FeedFetcher class fetches feed content from URLs using cURL, handles SSL verification, timeouts, redirects, and auto-detects feed format.
- AutoDetectParser
- Feed
use Hosseinhunta\Huntfeed\Transport\FeedFetcher;
// Create a feed fetcher
$fetcher = new FeedFetcher();
// Configure SSL (for development)
$fetcher->setVerifySSL(false);
// Set timeout
$fetcher->setTimeout(30);
// Set custom User-Agent
$fetcher->setUserAgent('MyApp/1.0 (+https://example.com)');
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $parser | AutoDetectParser | private | Parser for auto-detecting feed format |
| $timeout | int | private | HTTP request timeout in seconds |
| $maxRedirects | int | private | Maximum redirects to follow |
| $headers | array | private | Custom HTTP headers |
| $verifySsl | bool | private | Whether to verify SSL certificates |
| $caBundlePath | ?string | private | Path to CA bundle file |
Constructor
__construct(AutoDetectParser $parser = null)
Key Methods
public
fetch()
fetch(string $url): Feed
Feed
Fetches and parses a feed from URL.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $url | string | Feed URL |
Example
// Fetch a feed
try {
$feed = $fetcher->fetch('https://example.com/feed.xml');
echo "Feed title: {$feed->title}\n";
echo "Items: {$feed->itemsCount()}\n";
// Access original content for WebSub detection
$originalContent = $feed->getOriginalContent();
if ($originalContent) {
$hubUrl = WebSubSubscriber::detectHubFromFeed($originalContent);
if ($hubUrl) {
echo "WebSub hub: {$hubUrl}\n";
}
}
} catch (RuntimeException $e) {
echo "Error fetching feed: " . $e->getMessage() . "\n";
}
Exceptions
| Exception | Condition |
|---|---|
| RuntimeException | If URL is invalid |
| RuntimeException | If feed cannot be fetched (network error) |
| RuntimeException | If feed cannot be parsed (invalid format) |
public
fetchMultiple()
fetchMultiple(array $urls): array
array<string, Feed>
Fetches multiple feeds and returns them as an associative array.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $urls | array | Associative array keyed by feed ID. Each value is either a URL string ('feed_id' => 'url') or an options array ('feed_id' => ['url' => 'url', 'category' => 'cat']) |
Example
// Fetch multiple feeds
$urls = [
'hacker_news' => 'https://news.ycombinator.com/rss',
'techcrunch' => [
'url' => 'https://techcrunch.com/feed/',
'category' => 'Technology'
],
'bbc_news' => 'https://bbc.com/news/world/rss.xml'
];
$feeds = $fetcher->fetchMultiple($urls);
echo "Fetched " . count($feeds) . " feeds:\n";
foreach ($feeds as $feedId => $feed) {
echo "- {$feedId}: {$feed->title} ({$feed->itemsCount()} items)\n";
}
// Handle failures gracefully
$feeds = $fetcher->fetchMultiple($urls);
$successCount = count($feeds);
$totalCount = count($urls);
if ($successCount < $totalCount) {
$failedCount = $totalCount - $successCount;
echo "Warning: {$failedCount} feeds failed to fetch\n";
}
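The count comparison above suggests that feeds which fail are simply left out of the result. Under that assumption, the failed feed IDs can be recovered by comparing keys. The sketch below uses placeholder values in `$feeds` instead of real `Feed` objects so that it runs on its own.

```php
<?php
// $urls mirrors the input passed to fetchMultiple(); $feeds stands in for its result.
$urls = [
    'hacker_news' => 'https://news.ycombinator.com/rss',
    'techcrunch'  => ['url' => 'https://techcrunch.com/feed/', 'category' => 'Technology'],
    'bbc_news'    => 'https://bbc.com/news/world/rss.xml',
];
$feeds = ['hacker_news' => 'Feed object', 'bbc_news' => 'Feed object']; // placeholder values

// Keys present in $urls but missing from $feeds are the feeds that failed.
$failedIds = array_keys(array_diff_key($urls, $feeds));

foreach ($failedIds as $feedId) {
    echo "Failed to fetch: {$feedId}\n";
}
```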
public
hasNewItems()
hasNewItems(Feed $oldFeed, Feed $newFeed): bool
bool
Checks if a feed has new items compared to previous fetch.
Example
// Check for new items
$oldFeed = $fetcher->fetch('https://example.com/feed.xml');
sleep(60); // Wait for possible updates
$newFeed = $fetcher->fetch('https://example.com/feed.xml');
if ($fetcher->hasNewItems($oldFeed, $newFeed)) {
echo "Feed has new items!\n";
// Get only new items
$newItemsFeed = $fetcher->getNewItems($oldFeed, $newFeed);
echo "New items count: {$newItemsFeed->itemsCount()}\n";
foreach ($newItemsFeed->items() as $item) {
echo "- {$item->title}\n";
}
} else {
echo "No new items\n";
}
public
getNewItems()
getNewItems(Feed $oldFeed, Feed $newFeed): Feed
Feed
Gets only new items from a feed compared to previous version.
Example
// Get new items between two feed versions
$newItemsFeed = $fetcher->getNewItems($oldFeed, $newFeed);
// The returned Feed contains only new items
echo "New items found: {$newItemsFeed->itemsCount()}\n";
// Process new items
foreach ($newItemsFeed->items() as $item) {
processNewItem($item);
sendNotification($item);
saveToDatabase($item);
}
// You can also use this with FeedManager events
$manager->on('item:new', function($data) {
$item = $data['item'];
// Process each new item
});
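This reference does not say how `getNewItems()` decides that an item is new. One common approach is to compare unique item identifiers between the two versions. The sketch below illustrates that idea with plain arrays; `diff_new_items()` and the `id` key are hypothetical and do not reflect HuntFeed's internals.

```php
<?php
/**
 * Returns the items from $new whose 'id' does not appear in $old.
 * Items are plain arrays here; HuntFeed's own item type is not used.
 */
function diff_new_items(array $old, array $new): array
{
    // Flip the known IDs into keys so each lookup is a hash access.
    $known = array_flip(array_column($old, 'id'));

    return array_values(array_filter(
        $new,
        static fn (array $item): bool => !isset($known[$item['id']])
    ));
}

$old = [['id' => 'a', 'title' => 'First'], ['id' => 'b', 'title' => 'Second']];
$new = [['id' => 'c', 'title' => 'Third'], ['id' => 'a', 'title' => 'First']];

$fresh = diff_new_items($old, $new);
echo count($fresh) . " new item(s)\n";
```

Comparing by identifier rather than by position keeps the result stable when a feed reorders its entries.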
public
setVerifySSL()
setVerifySSL(bool $verify): self
Enables or disables SSL certificate verification. Disable it only in development.
Example
// For development (self-signed certificates)
$fetcher->setVerifySSL(false);
// For production (always verify)
$fetcher->setVerifySSL(true);
// Set custom CA bundle path
$fetcher->setCaBundlePath('/path/to/cacert.pem');
public
setTimeout()
setTimeout(int $seconds): self
Sets HTTP request timeout in seconds.
Example
// Set timeout to 30 seconds (default)
$fetcher->setTimeout(30);
// Set shorter timeout for fast-failing
$fetcher->setTimeout(10);
// Set longer timeout for slow feeds
$fetcher->setTimeout(60);
// Combine with other settings
$fetcher
->setTimeout(30)
->setMaxRedirects(5)
->setUserAgent('MyApp/1.0')
->addHeader('Accept', 'application/xml, application/json');
Other Methods
- setMaxRedirects(int $max): self - Set maximum redirects to follow
- addHeader(string $key, string $value): self - Add custom HTTP header
- setUserAgent(string $userAgent): self - Set User-Agent header
- setCaBundlePath(string $path): self - Set custom CA bundle path
- detectCaBundlePath(): void (private) - Auto-detect CA bundle path for Windows/Linux/macOS
- fetchContent(string $url): string (private) - Fetch raw content from URL using cURL
- Auto-detection works for common CA bundle locations on Windows, Linux, and macOS
- For production, ensure proper CA bundle is configured
- Use setCaBundlePath() for custom CA bundles
- Test SSL verification in staging before production
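Since FeedFetcher is described as using cURL, its settings presumably map onto standard cURL options. The sketch below shows one plausible mapping using real `CURLOPT_*` constants from PHP's cURL extension. The function `build_curl_options()` and the default of 5 redirects are assumptions for illustration; the library's private `fetchContent()` may be implemented differently.

```php
<?php
/**
 * Builds a cURL option array from fetcher-style settings.
 * The mapping is an assumption; HuntFeed's internal fetchContent() may differ.
 */
function build_curl_options(
    int $timeout = 30,
    int $maxRedirects = 5,
    bool $verifySsl = true,
    ?string $caBundlePath = null,
    array $headers = []
): array {
    $options = [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => $timeout,
        CURLOPT_FOLLOWLOCATION => $maxRedirects > 0,
        CURLOPT_MAXREDIRS      => $maxRedirects,
        CURLOPT_SSL_VERIFYPEER => $verifySsl,
        // 2 = check that the certificate matches the host name; 0 = skip the check.
        CURLOPT_SSL_VERIFYHOST => $verifySsl ? 2 : 0,
    ];

    if ($verifySsl && $caBundlePath !== null) {
        $options[CURLOPT_CAINFO] = $caBundlePath;
    }

    // cURL expects headers as a list of "Name: value" strings.
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = "{$name}: {$value}";
    }
    if ($lines !== []) {
        $options[CURLOPT_HTTPHEADER] = $lines;
    }

    return $options;
}

$options = build_curl_options(10, 3, true, '/path/to/cacert.pem', ['Accept' => 'application/xml']);
// Apply with: curl_setopt_array(curl_init($url), $options);
```

The sketch requires the cURL extension to be loaded, because the `CURLOPT_*` constants are defined by it.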
FeedScheduler
Manages periodic polling of feeds
- final class FeedScheduler
The FeedScheduler class manages periodic polling of feeds, tracks update times, and maintains feed history.
- FeedFetcher
- Feed
use Hosseinhunta\Huntfeed\Engine\FeedScheduler;
// Create a feed scheduler
$scheduler = new FeedScheduler();
// Or with custom fetcher
$fetcher = new FeedFetcher();
$scheduler = new FeedScheduler($fetcher);
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $fetcher | FeedFetcher | private | Feed fetcher instance |
| $scheduledFeeds | array | private | Scheduled feeds with configuration |
| $feedHistory | array<string, Feed[]> | private | Feed history for change detection |
Constructor
__construct(FeedFetcher $fetcher = null)
Key Methods
public
register()
register(string $feedId, string $url, int $intervalSeconds = 1800, bool $keepHistory = true): self
Registers a feed for periodic polling.
Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| $feedId | string | Unique identifier for the feed | Required |
| $url | string | Feed URL | Required |
| $intervalSeconds | int | Update interval in seconds | 1800 (30 minutes) |
| $keepHistory | bool | Keep history of feed states | true |
Example
// Register a feed for polling
$scheduler->register('tech_news', 'https://example.com/feed.xml', 300); // 5 minutes
// Register with history disabled
$scheduler->register('fast_news', 'https://news.com/feed', 60, false); // 1 minute, no history
// Register multiple feeds
$scheduler->registerMultiple([
'feed1' => ['url' => 'https://feed1.com', 'interval' => 300],
'feed2' => ['url' => 'https://feed2.com', 'interval' => 600],
'feed3' => ['url' => 'https://feed3.com', 'interval' => 900],
]);
public
checkUpdates()
checkUpdates(): array
array
Checks all registered feeds for updates. Returns only feeds that have new content.
Return Value
[
'feed_id' => [
'feed' => Feed, // Updated feed object
'new_items' => Feed, // Feed containing only new items
'is_updated' => bool // Whether feed was updated
],
// ...
]
Example
// Check for updates (usually called periodically)
$updates = $scheduler->checkUpdates();
foreach ($updates as $feedId => $data) {
if ($data['is_updated']) {
$newItemsCount = $data['new_items']->itemsCount();
echo "Feed '{$feedId}' updated with {$newItemsCount} new items\n";
// Process new items
foreach ($data['new_items']->items() as $item) {
processNewItem($item);
}
// Update statistics
updateFeedStatistics($feedId, $newItemsCount);
}
}
// This is typically used in a cron job or background task
// Example cron job: * * * * * php /path/to/check-updates.php
public
forceUpdate()
forceUpdate(string $feedId): bool
bool
Force updates a specific feed immediately, ignoring the interval.
Example
// Force update a specific feed
if ($scheduler->forceUpdate('tech_news')) {
echo "Feed force-updated successfully\n";
// Get the updated feed
$feed = $scheduler->getFeed('tech_news');
echo "Latest items: {$feed->itemsCount()}\n";
} else {
echo "Feed not found or update failed\n";
}
// Force update all feeds
$forceUpdated = $manager->forceUpdateAll();
echo "Force updated " . count($forceUpdated) . " feeds\n";
public
getStatus()
getStatus(string $feedId): ?array
array|null
Gets feed update status.
Return Value
[
'feed_id' => string,
'url' => string,
'last_update' => string, // ISO 8601 format
'next_update' => string, // ISO 8601 format
'interval' => int,
'seconds_since_update' => int,
'items_count' => int
]
Example
// Get feed status
$status = $scheduler->getStatus('tech_news');
if ($status) {
echo "Feed: {$status['feed_id']}\n";
echo "URL: {$status['url']}\n";
echo "Last update: {$status['last_update']}\n";
echo "Next update: {$status['next_update']}\n";
echo "Interval: {$status['interval']} seconds\n";
echo "Seconds since update: {$status['seconds_since_update']}\n";
echo "Items count: {$status['items_count']}\n";
// Check if update is due
if ($status['seconds_since_update'] >= $status['interval']) {
echo "Update is due!\n";
}
}
// Get status for all feeds
$allStatus = $scheduler->getAllStatus();
foreach ($allStatus as $feedId => $status) {
// Process each feed's status
}
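The time-based fields in the status array are related by simple arithmetic: the next update is the last update plus the interval, and an update is due once the elapsed time reaches the interval. The sketch below works through that relationship with fixed timestamps. `derive_status()` and the `is_due` key are hypothetical; only the other key names come from the `getStatus()` return value.

```php
<?php
/**
 * Derives the time-based status fields from a last-update timestamp.
 * Field names follow the getStatus() return value; the calculation is an assumption.
 */
function derive_status(DateTimeImmutable $lastUpdate, int $interval, DateTimeImmutable $now): array
{
    $elapsed = $now->getTimestamp() - $lastUpdate->getTimestamp();

    return [
        'last_update'          => $lastUpdate->format(DATE_ATOM),
        'next_update'          => $lastUpdate->modify("+{$interval} seconds")->format(DATE_ATOM),
        'interval'             => $interval,
        'seconds_since_update' => $elapsed,
        'is_due'               => $elapsed >= $interval, // extra key, not returned by getStatus()
    ];
}

$status = derive_status(
    new DateTimeImmutable('2024-01-15T10:00:00+00:00'),
    300,
    new DateTimeImmutable('2024-01-15T10:04:00+00:00')
);
```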
public
getHistory()
getHistory(string $feedId): ?array
Feed[]|null
Gets feed history (previous versions).
Example
// Get feed history
$history = $scheduler->getHistory('tech_news');
if ($history) {
echo "Feed history (last " . count($history) . " versions):\n";
foreach ($history as $index => $feed) {
$date = $feed->items()[0]->publishedAt ?? new DateTimeImmutable();
echo "Version {$index}: {$feed->itemsCount()} items ({$date->format('Y-m-d H:i')})\n";
}
// Compare current with previous version
if (count($history) >= 2) {
$current = $history[count($history) - 1];
$previous = $history[count($history) - 2];
$newItems = $fetcher->getNewItems($previous, $current);
echo "New items in last update: {$newItems->itemsCount()}\n";
}
}
Other Methods
- registerMultiple(array $feeds): self - Register multiple feeds at once
- getFeed(string $feedId): ?Feed - Get a feed by ID
- getAllFeeds(): array - Get all registered feeds
- getAllStatus(): array - Get status of all feeds
- unregister(string $feedId): bool - Unregister a feed
- clear(): void - Clear all feeds
- count(): int - Get registered feeds count
- Use appropriate intervals based on feed update frequency
- Enable history for feeds where change tracking is important
- Monitor feed status to detect stale or failing feeds
- Consider using WebSub for real-time updates instead of frequent polling
- Implement exponential backoff for failing feeds
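The last recommendation can be sketched as a small interval calculation: double the polling interval after each consecutive failure, up to a ceiling. This is an illustrative helper, not a HuntFeed feature; `backoff_interval()`, its parameters, and the one-day cap are all assumptions.

```php
<?php
/**
 * Doubles the polling interval for each consecutive failure, up to a cap.
 * All parameter names are illustrative; none come from HuntFeed.
 */
function backoff_interval(int $baseSeconds, int $failures, int $maxSeconds = 86400): int
{
    if ($failures <= 0) {
        return $baseSeconds;
    }
    // Cap the exponent so the shift cannot overflow on long failure streaks.
    $factor = 1 << min($failures, 20);

    return min($baseSeconds * $factor, $maxSeconds);
}

foreach ([0, 1, 2, 3, 10] as $failures) {
    echo $failures . ' failure(s): ' . backoff_interval(300, $failures) . "s\n";
}
```

The caller is expected to track the failure count per feed and reset it to zero after a successful fetch.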
ParserInterface
Interface for all feed parsers
- interface ParserInterface
The ParserInterface defines the contract for all feed parsers in HuntFeed. Each parser must implement methods to detect and parse specific feed formats.
namespace Hosseinhunta\Huntfeed\Parser;
use Hosseinhunta\Huntfeed\Feed\Feed;
interface ParserInterface
{
public function supports(string $xml): bool;
public function parse(string $xml, string $sourceUrl): Feed;
}
Methods
public
supports()
supports(string $xml): bool
bool
Checks if the parser supports the given feed content.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $xml | string | Feed content (XML or JSON) |
Example
// Check if parser supports feed content
$parser = new Rss20Parser();
$content = file_get_contents('https://example.com/feed.xml');
if ($parser->supports($content)) {
echo "Parser supports this feed format\n";
$feed = $parser->parse($content, 'https://example.com/feed.xml');
} else {
echo "Parser does not support this feed format\n";
}
public
parse()
parse(string $xml, string $sourceUrl): Feed
Feed
Parses feed content and returns a Feed object.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $xml | string | Feed content (XML or JSON) |
| $sourceUrl | string | Original feed URL |
Example
// Parse feed content
$parser = new AtomParser();
$content = file_get_contents('https://example.com/atom.xml');
try {
$feed = $parser->parse($content, 'https://example.com/atom.xml');
echo "Feed title: {$feed->title}\n";
echo "Items parsed: {$feed->itemsCount()}\n";
// Store original content for WebSub detection
$feed->setOriginalContent($content);
} catch (Exception $e) {
echo "Parse error: " . $e->getMessage() . "\n";
}
Exceptions
| Exception | Condition |
|---|---|
| RuntimeException | If feed content cannot be parsed |
| InvalidArgumentException | If feed content is invalid or empty |
Implementations
The following classes implement ParserInterface:
- Rss20Parser - Parses RSS 2.0 feeds
- AtomParser - Parses Atom feeds
- JsonFeedParser - Parses JSON Feed format
- RdfParser - Parses RDF/RSS 1.0 feeds
- GeoRssParser - Parses GeoRSS feeds
You can create custom parsers by implementing this interface. This allows HuntFeed to support custom or proprietary feed formats.
AutoDetectParser
Automatically detects and uses the appropriate parser for feed content
- final class AutoDetectParser
The AutoDetectParser automatically detects the feed format and uses the appropriate parser. It supports all built-in parsers and can be extended with custom parsers.
- ParserInterface (all implementations)
- Feed
use Hosseinhunta\Huntfeed\Parser\AutoDetectParser;
// Create auto-detect parser with all default parsers
$parser = AutoDetectParser::createDefault();
// Or create with custom parsers
$parser = new AutoDetectParser(
new Rss20Parser(),
new AtomParser(),
new JsonFeedParser()
// Add custom parsers here
);
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $parsers | ParserInterface[] | private | Array of registered parsers |
Constructor
__construct(ParserInterface ...$parsers)
Static Methods
static
public
createDefault()
createDefault(): self
AutoDetectParser
Creates an instance with all available parsers pre-registered.
Example
// Create parser with all default parsers
$parser = AutoDetectParser::createDefault();
// This includes:
// - GeoRssParser
// - Rss20Parser
// - AtomParser
// - JsonFeedParser
// - RdfParser
// Use with FeedFetcher
$fetcher = new FeedFetcher($parser);
Instance Methods
public
addParser()
addParser(ParserInterface $parser): self
Adds a custom parser to the list.
Example
// Create auto-detect parser
$parser = AutoDetectParser::createDefault();
// Add custom parser
class CustomParser implements ParserInterface {
public function supports(string $xml): bool {
return str_contains($xml, '<custom-feed'); // hypothetical root element of the custom format
}
public function parse(string $xml, string $sourceUrl): Feed {
// Parse custom format
$feed = new Feed($sourceUrl, 'Custom Feed');
// Add items...
return $feed;
}
}
$parser->addParser(new CustomParser());
// Now the auto-detect parser will also try CustomParser
$feed = $parser->parse($customFeedContent, 'https://example.com/custom.xml');
public
parse()
parse(string $content, string $url): Feed
Feed
Parses feed content and automatically detects its format.
Example
// Parse any feed format automatically
$parser = AutoDetectParser::createDefault();
// Feed content (could be RSS, Atom, JSON Feed, etc.)
$content = file_get_contents('https://example.com/feed');
try {
$feed = $parser->parse($content, 'https://example.com/feed');
echo "Feed format detected automatically\n";
echo "Title: {$feed->title}\n";
echo "Items: {$feed->itemsCount()}\n";
// The parser automatically used the correct parser:
// - Rss20Parser for RSS 2.0
// - AtomParser for Atom
// - JsonFeedParser for JSON Feed
// - RdfParser for RDF/RSS 1.0
// - GeoRssParser for GeoRSS
} catch (RuntimeException $e) {
echo "Unsupported feed format: " . $e->getMessage() . "\n";
echo "Supported formats: GeoRSS, RSS 2.0, Atom, JSON Feed, RDF/RSS 1.0\n";
}
Detection Order
The parser tries each registered parser in order until one returns true for supports():
- GeoRssParser
- Rss20Parser
- AtomParser
- JsonFeedParser
- RdfParser
- Any custom parsers added with addParser()
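The first-match-wins behaviour described above can be sketched without the library. The detectors below are deliberately simplified stand-ins for each parser's `supports()` check, and `detect_format()` is a hypothetical name. The sketch also shows why GeoRSS is tried first: a GeoRSS feed is also a valid RSS or Atom document, so a general parser placed earlier would claim it.

```php
<?php
/**
 * Returns the name of the first detector whose check accepts $content,
 * or throws when none match. Detectors are [name => callable] pairs.
 */
function detect_format(string $content, array $detectors): string
{
    foreach ($detectors as $name => $supports) {
        if ($supports($content)) {
            return $name;
        }
    }
    throw new RuntimeException('No parser supports this feed format');
}

// Simplified stand-ins for the built-in supports() checks, in the documented order.
$detectors = [
    'GeoRSS'    => fn (string $c): bool => str_contains($c, 'http://www.georss.org/georss'),
    'RSS 2.0'   => fn (string $c): bool => str_contains($c, '<rss'),
    'Atom'      => fn (string $c): bool => str_contains($c, 'http://www.w3.org/2005/Atom'),
    'JSON Feed' => fn (string $c): bool => str_contains($c, 'https://jsonfeed.org'),
    'RDF'       => fn (string $c): bool => str_contains($c, '<rdf:RDF'),
];

echo detect_format('<rss version="2.0" xmlns:georss="http://www.georss.org/georss"/>', $detectors), "\n";
echo detect_format('<rss version="2.0"><channel/></rss>', $detectors), "\n";
```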
Exceptions
| Exception | Condition |
|---|---|
| RuntimeException | If no parser supports the feed format |
Detection is lightweight: each parser's supports() check inspects only enough of the content to identify the format, and the first parser that matches handles the full parse.
Rss20Parser
Parses RSS 2.0 feeds with namespace support
- final class Rss20Parser implements ParserInterface
The Rss20Parser parses RSS 2.0 feeds with support for common namespaces including Dublin Core, Media RSS, and Content namespace.
use Hosseinhunta\Huntfeed\Parser\Rss20Parser;
// Create RSS 2.0 parser
$parser = new Rss20Parser();
// Check if content is RSS 2.0
if ($parser->supports($content)) {
$feed = $parser->parse($content, 'https://example.com/rss.xml');
}
Supported Namespaces
- content: http://purl.org/rss/1.0/modules/content/ - Full content
- dc: http://purl.org/dc/elements/1.1/ - Dublin Core metadata
- media: http://search.yahoo.com/mrss/ - Media RSS
- atom: http://www.w3.org/2005/Atom - Atom links in RSS
Detection Method
supports(string $xml): bool
Returns true if the XML contains an <rss> tag.
Parsing Features
public
parse()
parse(string $xml, string $sourceUrl): Feed
Parses RSS 2.0 feed content.
Extracted Fields
- Standard RSS: title, link, description, pubDate, guid, category, author, comments, enclosure
- Dublin Core: dc:creator, dc:date, dc:subject
- Content: content:encoded (full article content)
- Media RSS: media:content, media:thumbnail, media:title, media:description
- Custom Fields: All other elements are preserved in extra fields
Example RSS 2.0 Feed
<?xml version="1.0" encoding="UTF-8"?>
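<rss version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:media="http://search.yahoo.com/mrss/">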
<channel>
<title>Example Feed</title>
<link>https://example.com</link>
<description>Example RSS Feed</description>
<item>
<title>Example Article</title>
<link>https://example.com/article</link>
<description>Short description</description>
<content:encoded><![CDATA[<p>Full article content</p>]]></content:encoded>
<pubDate>Mon, 15 Jan 2024 10:30:00 GMT</pubDate>
<guid>article-123</guid>
<category>Technology</category>
<dc:creator>John Doe</dc:creator>
<dc:date>2024-01-15T10:30:00Z</dc:date>
<media:content url="https://example.com/image.jpg" type="image/jpeg"/>
<enclosure url="https://example.com/podcast.mp3" length="123456" type="audio/mpeg"/>
</item>
</channel>
</rss>
Extra Fields Structure
// Example extra fields for the above RSS item
$extra = [
'author' => 'John Doe', // From <author> or dc:creator
'content_encoded' => '<p>Full article content</p>',
'creator' => 'John Doe', // From dc:creator
'dc_date' => '2024-01-15T10:30:00Z',
'media_content' => [
[
'url' => 'https://example.com/image.jpg',
'type' => 'image/jpeg',
'medium' => ''
]
],
'custom_fields' => [
// Any other non-standard elements
]
];
private
extractExtra()
extractExtra(\SimpleXMLElement $item): array
Extracts extra metadata from RSS item. This method handles all namespaces and custom fields.
- Supports both <description> and content:encoded for content
- Automatically extracts enclosure URLs
- Preserves all custom fields in extra['custom_fields']
- Handles Media RSS for images, videos, and audio
- Supports Atom links within RSS feeds
AtomParser
Parses Atom feeds with comprehensive metadata extraction
- final class AtomParser implements ParserInterface
The AtomParser parses Atom feeds with support for authors, contributors, categories, links, and various extensions including Dublin Core and Media RSS.
use Hosseinhunta\Huntfeed\Parser\AtomParser;
// Create Atom parser
$parser = new AtomParser();
// Check if content is Atom feed
if ($parser->supports($content)) {
$feed = $parser->parse($content, 'https://example.com/atom.xml');
}
Supported Namespaces
- atom: http://www.w3.org/2005/Atom - Core Atom namespace
- dc: http://purl.org/dc/elements/1.1/ - Dublin Core metadata
- media: http://search.yahoo.com/mrss/ - Media RSS
Detection Method
supports(string $xml): bool
Returns true if the XML contains a <feed> tag with the Atom namespace.
Parsing Features
public
parse()
parse(string $xml, string $sourceUrl): Feed
Parses Atom feed content with comprehensive metadata extraction.
Extracted Fields
- Core Atom: id, title, updated, published, summary, content, author, contributor, category, link
- Links: Handles multiple links with rel attributes (alternate, enclosure, related, etc.)
- Authors/Contributors: name, email, uri
- Categories: term, scheme, label
- Dublin Core: dc:creator, dc:subject, dc:rights
- Media RSS: media:content, media:thumbnail
- Source: For republished entries
- Rights: Copyright information
Example Atom Feed
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:media="http://search.yahoo.com/mrss/">
<title>Example Atom Feed</title>
<link href="https://example.com"/>
<updated>2024-01-15T10:30:00Z</updated>
<entry>
<title>Example Article</title>
<link href="https://example.com/article"/>
<link rel="enclosure" href="https://example.com/image.jpg" type="image/jpeg"/>
<id>urn:uuid:article-123</id>
<published>2024-01-15T10:30:00Z</published>
<updated>2024-01-15T11:00:00Z</updated>
<summary>Short summary</summary>
<content type="html"><![CDATA[<p>Full article content</p>]]></content>
<author>
<name>John Doe</name>
<email>john@example.com</email>
<uri>https://example.com/author/john</uri>
</author>
<category term="Technology" scheme="https://example.com/categories" label="Tech"/>
<contributor>
<name>Jane Smith</name>
</contributor>
<rights>Copyright 2024</rights>
<dc:creator>John Doe</dc:creator>
<media:thumbnail url="https://example.com/thumb.jpg"/>
</entry>
</feed>
Link Handling
Atom allows multiple links per entry with different relationships. The parser:
- Uses rel="alternate" or no rel attribute as the primary link
- Extracts rel="enclosure" as the enclosure URL
- Stores all links in extra['links'] with their relationships
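The link rules above can be sketched with SimpleXML. This is an illustration of the described behaviour, not the parser's actual code; the variable names are hypothetical. A missing rel attribute is treated as "alternate", as the Atom specification (RFC 4287) requires.

```php
<?php
$xml = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom">
  <link href="https://example.com/article"/>
  <link rel="enclosure" href="https://example.com/image.jpg" type="image/jpeg"/>
</entry>
XML;

$entry = simplexml_load_string($xml);
$primary = null;
$enclosure = null;
$links = [];

foreach ($entry->link as $link) {
    // A missing rel attribute casts to an empty string and means "alternate".
    $rel = (string) $link['rel'];
    if ($rel === '') {
        $rel = 'alternate';
    }
    $links[$rel] = ['href' => (string) $link['href'], 'type' => (string) $link['type']];

    if ($rel === 'alternate' && $primary === null) {
        $primary = (string) $link['href'];
    } elseif ($rel === 'enclosure' && $enclosure === null) {
        $enclosure = (string) $link['href'];
    }
}

echo "Primary: {$primary}\nEnclosure: {$enclosure}\n";
```

Keying `$links` by relationship keeps only the last link per rel value, which matches the shape shown in the extra fields structure but would drop repeated relationships.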
Date Handling
The parser uses the following order for publication date:
- published element (preferred)
- updated element (fallback)
- Current date/time (if neither exists)
Content Handling
The parser uses the following order for content:
- summary element (preferred for Atom)
- content element (fallback)
Extra Fields Structure
// Example extra fields for the above Atom entry
$extra = [
'author' => [
'name' => 'John Doe',
'email' => 'john@example.com',
'uri' => 'https://example.com/author/john'
],
'updated' => '2024-01-15T11:00:00Z',
'links' => [
'alternate' => [
'href' => 'https://example.com/article',
'type' => '',
'hreflang' => ''
],
'enclosure' => [
'href' => 'https://example.com/image.jpg',
'type' => 'image/jpeg',
'hreflang' => ''
]
],
'categories' => [
[
'term' => 'Technology',
'scheme' => 'https://example.com/categories',
'label' => 'Tech'
]
],
'contributors' => [
[
'name' => 'Jane Smith',
'email' => '',
'uri' => ''
]
],
'dc_creator' => 'John Doe',
'media_thumbnail' => 'https://example.com/thumb.jpg',
'rights' => 'Copyright 2024',
'published' => '2024-01-15T10:30:00Z'
];
private
extractExtra()
extractExtra(\SimpleXMLElement $entry): array
Extracts comprehensive metadata from Atom entry.
private
getEnclosure()
getEnclosure(\SimpleXMLElement $links): ?string
Extracts enclosure URL from Atom links (link with rel="enclosure").
- Atom has better support for multiple authors and contributors
- Atom supports multiple categories with labels and schemes
- Atom's link system is more flexible with relationship types
- Atom has separate published and updated dates
- Atom is XML namespace-based by design
JsonFeedParser
Parses JSON Feed format (https://jsonfeed.org)
- final class JsonFeedParser implements ParserInterface
The JsonFeedParser parses JSON Feed format, a modern alternative to XML-based feeds. It supports both HTML and text content, attachments, authors, and tags.
use Hosseinhunta\Huntfeed\Parser\JsonFeedParser;
// Create JSON Feed parser
$parser = new JsonFeedParser();
// Check if content is JSON Feed
if ($parser->supports($content)) {
$feed = $parser->parse($content, 'https://example.com/feed.json');
}
Supported JSON Feed Version
Supports JSON Feed version 1.1 (https://jsonfeed.org/version/1.1)
Detection Method
supports(string $content): bool
Returns true if the content is valid JSON and contains a version field whose value includes "https://jsonfeed.org".
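A check along these lines could look like the sketch below. The function `looks_like_json_feed()` is a hypothetical stand-in and may differ from the parser's actual `supports()` implementation.

```php
<?php
/**
 * Returns true when $content decodes to a JSON object whose "version"
 * field mentions https://jsonfeed.org. A sketch, not HuntFeed's own code.
 */
function looks_like_json_feed(string $content): bool
{
    $data = json_decode($content, true);
    if (!is_array($data)) {
        return false; // invalid JSON, or a scalar such as "42"
    }
    $version = $data['version'] ?? null;

    return is_string($version) && str_contains($version, 'https://jsonfeed.org');
}
```

Checking that the decoded value is an array before reading `version` avoids errors on XML input and on JSON scalars.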
Parsing Features
public
parse()
parse(string $content, string $sourceUrl): Feed
Parses JSON Feed content.
Extracted Fields
- Core: id, title, url, content_html, content_text, summary
- Dates: date_published, date_modified
- Authors: author object with name, url, avatar
- Tags: Array of tags/categories
- Attachments: Array of attachments (enclosures)
- Images: image, banner_image
- External URL: For entries linking to external content
- Language: Content language
Example JSON Feed
{
"version": "https://jsonfeed.org/version/1.1",
"title": "Example JSON Feed",
"home_page_url": "https://example.com",
"feed_url": "https://example.com/feed.json",
"items": [
{
"id": "article-123",
"title": "Example Article",
"url": "https://example.com/article",
"content_html": "<p>Full article content</p>",
"content_text": "Full article content",
"summary": "Short summary",
"date_published": "2024-01-15T10:30:00Z",
"date_modified": "2024-01-15T11:00:00Z",
"author": {
"name": "John Doe",
"url": "https://example.com/author/john",
"avatar": "https://example.com/avatar.jpg"
},
"tags": ["Technology", "Programming"],
"attachments": [
{
"url": "https://example.com/podcast.mp3",
"mime_type": "audio/mpeg",
"title": "Podcast Episode",
"size_in_bytes": 123456,
"duration_in_seconds": 3600
}
],
"image": "https://example.com/image.jpg",
"banner_image": "https://example.com/banner.jpg",
"external_url": "https://external.com/article",
"language": "en"
}
]
}
Content Handling
The parser prefers content_html over content_text for the main content field.
Date Handling
The parser uses the following order for publication date:
- date_published (preferred)
- date_modified (fallback)
- Current date/time (if neither exists)
Enclosure Handling
The first attachment is used as the enclosure URL. All attachments are stored in extra['attachments'].
Category Handling
The first tag is used as the category. All tags are stored in extra['tags'].
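The content, date, enclosure, and category rules above amount to a chain of fallbacks, which PHP's null coalescing operator expresses compactly. The sketch below applies them to a decoded item; `map_json_item()` and its output keys are illustrative and are not HuntFeed's item properties.

```php
<?php
/**
 * Maps a decoded JSON Feed item to the core fields described above.
 * The output keys are illustrative, not HuntFeed's item properties.
 */
function map_json_item(array $item): array
{
    $date = $item['date_published'] ?? $item['date_modified'] ?? null;

    return [
        'content'   => $item['content_html'] ?? $item['content_text'] ?? '',
        'published' => $date !== null ? new DateTimeImmutable($date) : new DateTimeImmutable(),
        'enclosure' => $item['attachments'][0]['url'] ?? null,
        'category'  => $item['tags'][0] ?? null,
    ];
}

$mapped = map_json_item([
    'content_text'  => 'Full article content',
    'date_modified' => '2024-01-15T11:00:00Z',
    'tags'          => ['Technology', 'Programming'],
]);
```

In this call the item has no content_html, date_published, or attachments, so each field falls through to its fallback.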
Extra Fields Structure
// Example extra fields for the above JSON Feed item
$extra = [
'author' => [
'name' => 'John Doe',
'url' => 'https://example.com/author/john',
'avatar' => 'https://example.com/avatar.jpg'
],
'tags' => ['Technology', 'Programming'],
'attachments' => [
[
'url' => 'https://example.com/podcast.mp3',
'mime_type' => 'audio/mpeg',
'title' => 'Podcast Episode',
'size_in_bytes' => 123456,
'duration_in_seconds' => 3600
]
],
'date_published' => '2024-01-15T10:30:00Z',
'date_modified' => '2024-01-15T11:00:00Z',
'content_html' => '<p>Full article content</p>',
'content_text' => 'Full article content',
'summary' => 'Short summary',
'image' => 'https://example.com/image.jpg',
'banner_image' => 'https://example.com/banner.jpg',
'external_url' => 'https://external.com/article',
'language' => 'en'
];
private
extractExtra()
extractExtra(array $item): array
Extracts extra metadata from JSON Feed item.
private
isJson()
isJson(string $content): bool
Checks if content is valid JSON.
- Modern alternative to XML-based feeds
- Easier to parse and generate
- Supports both HTML and plain text content
- Rich author and attachment metadata
- Built-in support for tags/categories
- Better support for modern web content
RdfParser
Parses RDF/RSS 1.0 feeds
- final class RdfParser implements ParserInterface
The RdfParser parses RDF/RSS 1.0 feeds with support for Dublin Core metadata and RDF attributes.
use Hosseinhunta\Huntfeed\Parser\RdfParser;
// Create RDF parser
$parser = new RdfParser();
// Check if content is RDF/RSS 1.0
if ($parser->supports($content)) {
$feed = $parser->parse($content, 'https://example.com/rdf.xml');
}
Supported Namespaces
- rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# - RDF syntax
- rss: http://purl.org/rss/1.0/ - RSS 1.0 namespace
- dc: http://purl.org/dc/elements/1.1/ - Dublin Core metadata
- content: http://purl.org/rss/1.0/modules/content/ - Content namespace
- media: http://search.yahoo.com/mrss/ - Media RSS
Detection Method
supports(string $xml): bool
Returns true if the XML contains an <rdf:RDF> tag with the RSS 1.0 namespace.
Parsing Features
public
parse()
parse(string $xml, string $sourceUrl): Feed
Parses RDF/RSS 1.0 feed content.
Extracted Fields
- RDF: rdf:about attribute
- RSS 1.0: title, link, description
- Dublin Core: dc:creator, dc:subject, dc:date, dc:language, dc:rights, dc:contributor, dc:publisher, dc:relation, dc:coverage
- Content: content:encoded (full article content)
- Media RSS: media:content, media:thumbnail
- Standard RSS: author, comments, category
Example RDF/RSS 1.0 Feed
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rss="http://purl.org/rss/1.0/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:content="http://purl.org/rss/1.0/modules/content/">
<rss:channel rdf:about="https://example.com/">
<rss:title>Example RDF Feed</rss:title>
<rss:link>https://example.com</rss:link>
<rss:items>
<rdf:Seq>
<rdf:li rdf:resource="https://example.com/article"/>
</rdf:Seq>
</rss:items>
</rss:channel>
<rss:item rdf:about="https://example.com/article">
<rss:title>Example Article</rss:title>
<rss:link>https://example.com/article</rss:link>
<rss:description>Article description</rss:description>
<dc:creator>John Doe</dc:creator>
<dc:date>2024-01-15T10:30:00Z</dc:date>
<dc:subject>Technology</dc:subject>
<dc:language>en</dc:language>
<content:encoded><![CDATA[<p>Full article content</p>]]></content:encoded>
</rss:item>
</rdf:RDF>
Date Handling
Uses dc:date for publication date, falls back to current date/time if not present.
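Reading dc:date means asking SimpleXML for children by namespace URI, because prefixed elements are not reachable through plain property access. The sketch below shows that lookup together with the fallback to the current time. It illustrates the rule rather than reproducing the parser's code, and the variable names are hypothetical.

```php
<?php
$xml = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rss="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rss:item rdf:about="https://example.com/article">
    <rss:title>Example Article</rss:title>
    <dc:date>2024-01-15T10:30:00Z</dc:date>
  </rss:item>
</rdf:RDF>
XML;

$rssNs = 'http://purl.org/rss/1.0/';
$dcNs  = 'http://purl.org/dc/elements/1.1/';

$root = simplexml_load_string($xml);

foreach ($root->children($rssNs)->item as $item) {
    // Namespaced children must be requested by namespace URI.
    $dcDate = (string) $item->children($dcNs)->date;

    $published = $dcDate !== ''
        ? new DateTimeImmutable($dcDate)
        : new DateTimeImmutable(); // fallback: current date/time

    echo (string) $item->children($rssNs)->title . ': ' . $published->format(DATE_ATOM) . "\n";
}
```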
Extra Fields Structure
// Example extra fields for the above RDF item
$extra = [
'about' => 'https://example.com/article', // rdf:about attribute
'creator' => 'John Doe', // dc:creator
'subject' => 'Technology', // dc:subject
'dc_date' => '2024-01-15T10:30:00Z',
'language' => 'en', // dc:language
'content_encoded' => '<p>Full article content</p>'
];
private
extractExtra()
extractExtra(\SimpleXMLElement $item): array
Extracts extra metadata from RDF item.
GeoRssParser
Parses GeoRSS feeds with geographic data
- final class GeoRssParser implements ParserInterface
The GeoRssParser parses GeoRSS feeds, which are RSS or Atom feeds with geographic information. It supports both Simple and GML formats.
use Hosseinhunta\Huntfeed\Parser\GeoRssParser;
// Create GeoRSS parser
$parser = new GeoRssParser();
// Check if content is GeoRSS
if ($parser->supports($content)) {
$feed = $parser->parse($content, 'https://example.com/georss.xml');
}
Supported Namespaces
- georss: http://www.georss.org/georss - GeoRSS Simple format
- gml: http://www.opengis.net/gml - Geography Markup Language
- atom: http://www.w3.org/2005/Atom - Atom format
Detection Method
supports(string $xml): bool
Returns true if the XML contains GeoRSS namespace and is either RSS or Atom format.
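A detection check of this kind could look roughly like the sketch below. `looksLikeGeoRss()` is a hypothetical stand-in; the string and regex tests are assumptions, and the library's own supports() may inspect the document differently.

```php
<?php
// Hypothetical stand-in for a supports() check; HuntFeed's own logic may differ.
function looksLikeGeoRss(string $xml): bool
{
    // The GeoRSS namespace URI must be declared somewhere in the document.
    $hasGeoNamespace = str_contains($xml, 'http://www.georss.org/georss');

    // RSS feeds have an <rss> root; Atom feeds have a <feed> root in the Atom namespace.
    $isRss  = preg_match('/<rss[\s>]/i', $xml) === 1;
    $isAtom = str_contains($xml, 'http://www.w3.org/2005/Atom')
        && preg_match('/<feed[\s>]/i', $xml) === 1;

    return $hasGeoNamespace && ($isRss || $isAtom);
}

$sample = '<rss version="2.0" xmlns:georss="http://www.georss.org/georss"><channel/></rss>';
var_dump(looksLikeGeoRss($sample));
```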
Parsing Features
public
parse()
parse(string $xml, string $sourceUrl): Feed
Parses GeoRSS feed content with geographic data extraction.
Supported GeoRSS Formats
- Point: Single coordinate (latitude, longitude)
- Line: Series of coordinates
- Polygon: Closed shape coordinates
- Box: Bounding box (southwest and northeast corners)
- GML Point: Geography Markup Language format
Example GeoRSS Feed
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:georss="http://www.georss.org/georss">
<channel>
<title>Earthquake Feed</title>
<item>
<title>Magnitude 5.5 Earthquake</title>
<link>https://example.com/earthquake/123</link>
<description>Earthquake near San Francisco</description>
<pubDate>Mon, 15 Jan 2024 10:30:00 GMT</pubDate>
<georss:point>37.7749 -122.4194</georss:point>
<georss:featureName>San Francisco</georss:featureName>
</item>
<item>
<title>Storm Path</title>
<link>https://example.com/storm/456</link>
<description>Storm tracking path</description>
<pubDate>Mon, 15 Jan 2024 11:00:00 GMT</pubDate>
<georss:line>37.0 -122.0 38.0 -121.0 39.0 -120.0</georss:line>
</item>
</channel>
</rss>
Geo Data Structure
// Example geo data for the above GeoRSS items
// First item (point):
$geoData1 = [
'type' => 'point',
'latitude' => 37.7749,
'longitude' => -122.4194,
'featureName' => 'San Francisco'
];
// Second item (line):
$geoData2 = [
'type' => 'line',
'coordinates' => [
['latitude' => 37.0, 'longitude' => -122.0],
['latitude' => 38.0, 'longitude' => -121.0],
['latitude' => 39.0, 'longitude' => -120.0]
]
];
// Stored in extra['geo'] field
$item->getExtra('geo'); // Returns geo data array
Format Detection
The parser automatically detects whether the feed is RSS or Atom format and parses accordingly.
Extra Fields Structure
// Example extra fields for GeoRSS item
$extra = [
'geo' => [
'type' => 'point',
'latitude' => 37.7749,
'longitude' => -122.4194,
'featureName' => 'San Francisco'
],
// For Atom feeds, also includes author
'author' => 'John Doe' // If present in Atom feed
];
private
extractGeoData()
extractGeoData(\SimpleXMLElement $item): ?array
Extracts geographic data from GeoRSS item.
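To make the geo data structure concrete, here is a small sketch that turns the text of a georss:point or georss:line element into the array shapes documented above. The function names are hypothetical and the validation rules are assumptions; the parser's private extractGeoData() is not shown in this reference.

```php
<?php
// Hypothetical helpers mirroring the documented geo structure; not the library's code.

/** Split "lat lon lat lon ..." into latitude/longitude pairs. */
function parseCoordinatePairs(string $text): array
{
    $numbers = preg_split('/[\s,]+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if (count($numbers) === 0 || count($numbers) % 2 !== 0) {
        throw new InvalidArgumentException('Expected an even, non-zero number of coordinates');
    }

    $pairs = [];
    foreach (array_chunk($numbers, 2) as [$lat, $lon]) {
        $pairs[] = ['latitude' => (float) $lat, 'longitude' => (float) $lon];
    }
    return $pairs;
}

function parseGeoRssPoint(string $text): array
{
    $pairs = parseCoordinatePairs($text);
    if (count($pairs) !== 1) {
        throw new InvalidArgumentException('A point needs exactly one coordinate pair');
    }
    return ['type' => 'point'] + $pairs[0];
}

function parseGeoRssLine(string $text): array
{
    return ['type' => 'line', 'coordinates' => parseCoordinatePairs($text)];
}

print_r(parseGeoRssPoint('37.7749 -122.4194'));
print_r(parseGeoRssLine('37.0 -122.0 38.0 -121.0 39.0 -120.0'));
```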
- Earthquake and natural disaster feeds
- Weather and storm tracking
- Location-based news and events
- GPS tracking and route data
- Geographic information systems (GIS)
- Mapping and navigation applications
WebSubSubscriber
Handles subscription to WebSub hubs for push-based feed updates
The WebSubSubscriber class manages WebSub (PubSubHubbub) subscriptions for push-based feed updates instead of polling.
- FeedFetcher
use Hosseinhunta\Huntfeed\WebSub\WebSubSubscriber;
use Hosseinhunta\Huntfeed\Transport\FeedFetcher;
// Create WebSub subscriber
$fetcher = new FeedFetcher();
$subscriber = new WebSubSubscriber(
$fetcher,
'https://your-domain.com/websub-callback.php' // Callback URL
);
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $fetcher | FeedFetcher | private | Feed fetcher instance |
| $callbackUrl | string | private | Callback URL for hub notifications |
| $subscriptions | array | private | Active subscriptions data |
| $leaseSeconds | int | private | Subscription lease duration in seconds |
| $verificationTimeout | int | private | Verification timeout in seconds |
Constructor
__construct(FeedFetcher $fetcher, string $callbackUrl)
Key Methods
public
subscribe()
subscribe(string $feedUrl, string $hubUrl, ?callable $onVerification = null): array
array
Subscribes to a WebSub hub for a given feed.
Example
// Subscribe to a WebSub hub
$result = $subscriber->subscribe(
'https://example.com/feed.xml', // Feed URL
'https://pubsubhubbub.com/hub', // Hub URL
function($data) {
// Called when subscription is verified
echo "Subscription verified for {$data['feed_url']}\n";
echo "Mode: {$data['mode']}\n";
echo "Lease seconds: {$data['lease_seconds']}\n";
}
);
if ($result['success']) {
echo "Subscription request sent. Subscription ID: {$result['subscription_id']}\n";
echo "Awaiting verification from hub...\n";
// Store subscription ID for later reference
$subscriptionId = $result['subscription_id'];
} else {
echo "Subscription failed: {$result['error']}\n";
}
public
verifyChallenge()
verifyChallenge(array $params): array
array
Verifies a subscription challenge from the hub. Call it from your callback endpoint when the hub sends its verification GET request, so the hub can confirm that the callback URL is valid.
Example Callback Endpoint
// In your callback endpoint (websub-callback.php)
$params = $_GET;
$result = $subscriber->verifyChallenge($params);
if ($result['success']) {
// Return the challenge to complete verification
echo $result['challenge'];
} else {
http_response_code(403);
echo "Verification failed: {$result['error']}";
}
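The checks behind a verification step like this can be sketched as follows. `checkChallenge()` and the `$pendingTopics` list are hypothetical, and the returned array only imitates the `success` / `challenge` / `error` keys used in the example above. PHP rewrites the dots in `hub.mode`, `hub.topic` and `hub.challenge` to underscores when it fills `$_GET`.

```php
<?php
// Hypothetical verifier; the return shape is modeled on the example above, not on HuntFeed internals.
function checkChallenge(array $params, array $pendingTopics): array
{
    // PHP exposes hub.mode / hub.topic / hub.challenge as hub_mode / hub_topic / hub_challenge.
    $mode      = $params['hub_mode'] ?? '';
    $topic     = $params['hub_topic'] ?? '';
    $challenge = $params['hub_challenge'] ?? '';

    if (!in_array($mode, ['subscribe', 'unsubscribe'], true)) {
        return ['success' => false, 'error' => 'Unexpected hub.mode'];
    }
    if ($challenge === '') {
        return ['success' => false, 'error' => 'Missing hub.challenge'];
    }
    // Only confirm topics we actually asked the hub about.
    if (!in_array($topic, $pendingTopics, true)) {
        return ['success' => false, 'error' => 'Unknown topic'];
    }

    // The endpoint must echo this value back to the hub unchanged.
    return ['success' => true, 'challenge' => $challenge];
}

$result = checkChallenge(
    ['hub_mode' => 'subscribe', 'hub_topic' => 'https://example.com/feed.xml', 'hub_challenge' => 'abc123'],
    ['https://example.com/feed.xml']
);
echo $result['success'] ? $result['challenge'] : $result['error'], "\n";
```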
public
handleNotification()
handleNotification(string $body, array $headers): array
array
Handles incoming push notification from hub. Verifies HMAC signature and extracts feed items.
Example
// In your callback endpoint for POST requests
$body = file_get_contents('php://input');
$headers = getallheaders();
$result = $subscriber->handleNotification($body, $headers);
if ($result['success']) {
// Acknowledge receipt with 204 No Content; set the status before producing any output
http_response_code(204);
// Process the new items (log instead of echo, since a 204 response carries no body)
error_log("Received {$result['items_count']} new items");
foreach ($result['items'] as $item) {
error_log("- {$item['title']} ({$item['link']}), published {$item['pubDate']}");
// Save to database, send notifications, etc.
saveItemToDatabase($item);
}
} else {
http_response_code(400);
echo "Invalid notification: {$result['error']}";
}
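The HMAC check mentioned for handleNotification() can be illustrated with PHP's built-in hashing functions. This is a sketch under assumptions: `isValidHubSignature()` is a hypothetical name, and it expects the header value in the `method=signature` form used by WebSub (for example `sha256=...`); how HuntFeed stores and looks up the shared secret is not covered here.

```php
<?php
// Hypothetical helper: verify an X-Hub-Signature header value against the raw request body.
function isValidHubSignature(string $body, string $signatureHeader, string $secret): bool
{
    $parts = explode('=', $signatureHeader, 2);
    if (count($parts) !== 2) {
        return false; // not in "method=signature" form
    }

    [$algo, $signature] = $parts;
    $algo = strtolower($algo);
    if (!in_array($algo, ['sha1', 'sha256', 'sha384', 'sha512'], true)) {
        return false; // refuse unknown or weak methods
    }

    $expected = hash_hmac($algo, $body, $secret);

    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($expected, strtolower($signature));
}

$secret = 's3cret';
$body   = '<feed>...</feed>';
$header = 'sha256=' . hash_hmac('sha256', $body, $secret);
var_dump(isValidHubSignature($body, $header, $secret));
```

The signature must be computed over the raw body exactly as received, so read `php://input` once and pass the same string to both the check and the parser.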
static
public
detectHubFromFeed()
detectHubFromFeed(string $feedContent): ?string
string|null
Detects the WebSub hub URL in a feed. Parses the RSS/Atom content and looks for the hub link (a link element with rel="hub") as defined by the WebSub specification.
Example
// Fetch feed and detect hub
$feedContent = file_get_contents('https://example.com/feed.xml');
$hubUrl = WebSubSubscriber::detectHubFromFeed($feedContent);
if ($hubUrl) {
echo "WebSub hub detected: {$hubUrl}\n";
// Subscribe to the hub
$subscriber->subscribe('https://example.com/feed.xml', $hubUrl);
} else {
echo "No WebSub hub found in feed\n";
}
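Hub discovery from feed content can be approximated with SimpleXML and XPath, as in the sketch below. `findHubUrl()` is a hypothetical function written for illustration; it only looks at link elements inside the document and ignores HTTP Link headers, which hubs may also be advertised through.

```php
<?php
// Hypothetical sketch of hub discovery; detectHubFromFeed() may be implemented differently.
function findHubUrl(string $feedContent): ?string
{
    $previous = libxml_use_internal_errors(true); // keep malformed XML from emitting warnings
    $xml = simplexml_load_string($feedContent);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        return null;
    }

    // local-name() matches both <link> (Atom) and <atom:link> (inside RSS), whatever the prefix.
    $hrefs = $xml->xpath('//*[local-name()="link"][@rel="hub"]/@href');
    if (empty($hrefs)) {
        return null;
    }

    $href = trim((string) $hrefs[0]);
    return $href !== '' ? $href : null;
}

$atom = '<feed xmlns="http://www.w3.org/2005/Atom">'
      . '<link rel="self" href="https://example.com/feed.xml"/>'
      . '<link rel="hub" href="https://hub.example.com/"/>'
      . '</feed>';
var_dump(findHubUrl($atom));
```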
public
getSubscriptionStatus()
getSubscriptionStatus(?string $feedUrl = null): array
array
Gets subscription status for a specific feed or all subscriptions.
Example
// Get status for specific feed
$status = $subscriber->getSubscriptionStatus('https://example.com/feed.xml');
// Get status for all subscriptions
$allStatus = $subscriber->getSubscriptionStatus();
echo "Total subscriptions: {$allStatus['total_subscriptions']}\n";
foreach ($allStatus['subscriptions'] as $subscription) {
echo "Feed: {$subscription['feed_url']}\n";
echo "Hub: {$subscription['hub_url']}\n";
echo "Verified: " . ($subscription['verified'] ? 'Yes' : 'No') . "\n";
echo "Subscribed at: {$subscription['subscribed_at']}\n";
echo "Lease seconds: {$subscription['lease_seconds']}\n\n";
}
public
unsubscribe()
unsubscribe(string $feedUrl): array
array
Unsubscribes from a WebSub hub.
Configuration Methods
- setCallbackUrl(string $url): self - Set callback URL
- setLeaseSeconds(int $seconds): self - Set lease seconds
Statistics Methods
- getSubscriptionCount(): int - Get total subscription count
- getVerifiedCount(): int - Get verified subscription count
- Hub sends GET request with hub_challenge parameter for verification
- After verification, hub sends POST requests with feed content
- HMAC signature in X-Hub-Signature header verifies authenticity
- Subscriptions expire and need to be renewed (lease seconds)
- Always use HTTPS for callback URLs in production
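Because subscriptions expire, a scheduled job usually has to decide when to re-subscribe. The sketch below shows one way to make that decision from the `subscribed_at` and `lease_seconds` values reported by getSubscriptionStatus(). `needsRenewal()` and the 10% safety margin are assumptions for illustration, not library behavior.

```php
<?php
// Hypothetical helper: decide whether a subscription should be renewed now.
function needsRenewal(
    DateTimeImmutable $subscribedAt,
    int $leaseSeconds,
    DateTimeImmutable $now,
    float $margin = 0.1 // renew once the last 10% of the lease has begun (assumed policy)
): bool {
    $expiresAt = $subscribedAt->getTimestamp() + $leaseSeconds;
    $renewFrom = $expiresAt - (int) round($leaseSeconds * $margin);

    return $now->getTimestamp() >= $renewFrom;
}

$subscribedAt = new DateTimeImmutable('2024-01-01T00:00:00Z');
$now          = new DateTimeImmutable('2024-01-01T22:00:00Z');

if (needsRenewal($subscribedAt, 86400, $now)) {
    echo "Lease is about to expire, subscribe again\n";
}
```

Renewing is done by sending a new subscribe request for the same feed and hub before the lease runs out.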
WebSubHandler
Receives and processes WebSub HTTP callbacks
The WebSubHandler class provides an HTTP endpoint handler for WebSub callbacks. It processes both GET requests for verification challenges and POST requests for push notifications.
- WebSubSubscriber
use Hosseinhunta\Huntfeed\WebSub\WebSubHandler;
use Hosseinhunta\Huntfeed\WebSub\WebSubSubscriber;
// Create handler ($fetcher is a FeedFetcher instance, created as in the WebSubSubscriber example)
$subscriber = new WebSubSubscriber($fetcher, 'https://example.com/websub-callback.php');
$handler = new WebSubHandler($subscriber);
// Set up notification callback
$handler->onNotification(function($notification) {
// Process incoming notifications
foreach ($notification['items'] as $item) {
processNewItem($item);
}
});
Properties
| Property | Type | Access | Description |
|---|---|---|---|
| $subscriber | WebSubSubscriber | private | WebSub subscriber instance |
| $onNotification | ?Closure | private | Notification callback function |
Constructor
__construct(WebSubSubscriber $subscriber)
Key Methods
public
onNotification()
onNotification(callable $callback): self
Sets callback when notification is received.
Example
// Set up notification handler
$handler->onNotification(function($notification) {
// $notification contains:
// - success: bool
// - items_count: int
// - items: array of feed items
// - message: string
if ($notification['success']) {
echo "Received {$notification['items_count']} new items via WebSub\n";
foreach ($notification['items'] as $item) {
// Process each item
saveToDatabase($item);
sendNotification($item);
updateStatistics($item);
}
// You could also trigger other actions
clearCache();
updateSitemap();
sendWebhook('websub:notification', $notification);
} else {
error_log("WebSub notification error: {$notification['error']}");
}
});
public
processRequest()
processRequest(string $method, array $query, string $body, array $headers): array
array
Processes incoming HTTP request. Handles both GET (verification) and POST (notification) requests.
Example HTTP Endpoint
// Complete WebSub callback endpoint example
// Save as websub-callback.php
require_once 'vendor/autoload.php';
use Hosseinhunta\Huntfeed\WebSub\WebSubSubscriber;
use Hosseinhunta\Huntfeed\WebSub\WebSubHandler;
use Hosseinhunta\Huntfeed\Transport\FeedFetcher;
// Get request data
$method = $_SERVER['REQUEST_METHOD'];
$query = $_GET;
$body = file_get_contents('php://input');
// Get headers
$headers = [];
foreach ($_SERVER as $key => $value) {
if (strpos($key, 'HTTP_') === 0) {
$headerName = substr($key, 5); // strip only the leading "HTTP_" prefix
$headerName = str_replace('_', '-', $headerName);
$headers[$headerName] = $value;
}
}
// Create components
$fetcher = new FeedFetcher();
$subscriber = new WebSubSubscriber(
$fetcher,
'https://your-domain.com/websub-callback.php'
);
$handler = new WebSubHandler($subscriber);
// Set up notification processing
$handler->onNotification(function($notification) {
// Process incoming feed items
$pdo = new PDO('mysql:host=localhost;dbname=feeds', 'username', 'password');
// Prepare the statement once and reuse it for every item
$stmt = $pdo->prepare('INSERT INTO feed_items (title, link, content, published_at) VALUES (?, ?, ?, ?)');
foreach ($notification['items'] as $item) {
$stmt->execute([
$item['title'],
$item['link'],
$item['description'] ?? '',
$item['pubDate'] ?? date('Y-m-d H:i:s')
]);
// Send notification via email, Slack, etc.
sendSlackMessage("New item: {$item['title']}");
}
});
// Process the request
$result = $handler->processRequest($method, $query, $body, $headers);
// Send response
http_response_code($result['status'] ?? 200);
if (isset($result['body'])) {
echo $result['body'];
}
// Optional: Log the request
file_put_contents('websub.log', date('Y-m-d H:i:s') . " - {$method} - " . json_encode($result) . "\n", FILE_APPEND);
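Header names rebuilt from `$_SERVER` come out in upper case (for example `X-HUB-SIGNATURE`), while `getallheaders()` typically returns them as sent (`X-Hub-Signature`). Whether HuntFeed normalizes names internally is not stated here, so code of your own that inspects the `$headers` array may want a case-insensitive lookup. `findHeader()` below is a hypothetical helper for that purpose.

```php
<?php
// Hypothetical helper: look up an HTTP header without depending on its letter case.
function findHeader(array $headers, string $name): ?string
{
    foreach ($headers as $key => $value) {
        if (strcasecmp((string) $key, $name) === 0) {
            return (string) $value;
        }
    }
    return null; // header not present
}

// Header array as produced by the $_SERVER loop in the endpoint example.
$headers = [
    'X-HUB-SIGNATURE' => 'sha256=0123abcd',
    'USER-AGENT'      => 'ExampleHub/1.0',
];

var_dump(findHeader($headers, 'X-Hub-Signature'));
var_dump(findHeader($headers, 'Link'));
```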
static
public
generateIntegrationCode()
generateIntegrationCode(): string
string
Generates integration code for your application. Returns PHP code to integrate handler.
Generated Code
<?php
// Add this to your public HTTP endpoint (e.g., /websub-callback.php)
// Get request data
$method = $_SERVER['REQUEST_METHOD'];
$query = $_GET;
$body = file_get_contents('php://input');
// Get headers
$headers = [];
foreach ($_SERVER as $key => $value) {
if (strpos($key, 'HTTP_') === 0) {
$headerName = str_replace('HTTP_', '', $key);
$headerName = str_replace('_', '-', $headerName);
$headers[$headerName] = $value;
}
}
// Create handler and process
$subscriber = new WebSubSubscriber($fetcher, 'http://your-domain.com/websub-callback.php');
$handler = new WebSubHandler($subscriber);
$handler->onNotification(function($notification) {
// Process incoming notification
// $notification contains 'items_count' and 'items'
// Add items to your database or feed manager
foreach ($notification['items'] as $item) {
// Save to database or process
}
});
$result = $handler->processRequest($method, $query, $body, $headers);
// Send response
http_response_code($result['status'] ?? 200);
echo $result['body'] ?? '';
Private Methods
- handleVerification(array $params): array (private) - Handle verification challenge from hub
- handleNotification(string $body, array $headers): array (private) - Handle push notification from hub
- Always validate HMAC signatures when secret is provided
- Rate limit your endpoint to prevent abuse
- Log all incoming requests for debugging
- Use HTTPS in production
- Implement IP whitelisting if possible
UpdateDetector
Detects new items in feeds by comparing with known fingerprints
- final class UpdateDetector
The UpdateDetector class detects new items by comparing current items with known fingerprints. Used internally by FeedScheduler and FeedFetcher.
use Hosseinhunta\Huntfeed\Engine\UpdateDetector;
// Create update detector
$detector = new UpdateDetector();
// Detect new items
$newItems = $detector->detect($currentItems, $knownFingerprints);
Key Methods
public
detect()
detect(array $current, array $knownFingerprints): array
FeedItem[]
Detects new items by comparing current items with known fingerprints.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $current | FeedItem[] | Current feed items |
| $knownFingerprints | string[] | Array of known item fingerprints |
Example
// Example usage in feed update process
$detector = new UpdateDetector();
// Get current feed items
$currentItems = $feed->items();
// Get known fingerprints from previous fetch
$knownFingerprints = loadFingerprintsFromDatabase($feedId);
// Detect new items
$newItems = $detector->detect($currentItems, $knownFingerprints);
if (!empty($newItems)) {
echo "Found " . count($newItems) . " new items\n";
// Process new items
foreach ($newItems as $item) {
processNewItem($item);
// Update known fingerprints
$knownFingerprints[] = $item->fingerprint();
}
// Save updated fingerprints
saveFingerprintsToDatabase($feedId, $knownFingerprints);
} else {
echo "No new items found\n";
}
// This is similar to how FeedFetcher::hasNewItems() works internally
$hasNewItems = $fetcher->hasNewItems($oldFeed, $newFeed);
if ($hasNewItems) {
$newItemsFeed = $fetcher->getNewItems($oldFeed, $newFeed);
}
For efficient update detection, store fingerprints in a database or cache. This allows quick comparison without storing entire feed items.
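The comparison itself is simple enough to sketch without the library. In the example below, `detectNewByFingerprint()` is a hypothetical function, items are plain arrays, and the fingerprint is assumed to be a SHA-256 hash of the link; the real FeedItem::fingerprint() may hash different fields.

```php
<?php
// Hypothetical sketch of fingerprint-based detection; not the UpdateDetector source.
function detectNewByFingerprint(array $current, array $knownFingerprints, callable $fingerprintOf): array
{
    // Flip the list so each lookup is a hash-key check instead of a linear search.
    $known = array_flip($knownFingerprints);

    $new = [];
    foreach ($current as $item) {
        $fingerprint = $fingerprintOf($item);
        if (!isset($known[$fingerprint])) {
            $new[] = $item;
            $known[$fingerprint] = true; // also skip duplicates within the same batch
        }
    }
    return $new;
}

// Assumed fingerprint: SHA-256 of the item link.
$fingerprintOf = fn(array $item): string => hash('sha256', $item['link']);

$known = [hash('sha256', 'https://example.com/a')];
$current = [
    ['title' => 'A', 'link' => 'https://example.com/a'],
    ['title' => 'B', 'link' => 'https://example.com/b'],
];

foreach (detectNewByFingerprint($current, $known, $fingerprintOf) as $item) {
    echo "New: {$item['title']}\n";
}
```

Storing only the fingerprints keeps the lookup table small, which is the point made in the paragraph above.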
FeedExporter
Exports feeds to various formats (JSON, RSS, Atom, CSV, HTML, etc.)
- final class FeedExporter
The FeedExporter class provides static methods to export Feed objects to various formats for different use cases.
use Hosseinhunta\Huntfeed\Hub\FeedExporter;
// All methods are static
$json = FeedExporter::toJSON($feed);
$rss = FeedExporter::toRSS($feed);
$html = FeedExporter::toHTML($feed);
Static Methods
static
public
toJSON()
toJSON(Feed $feed, bool $pretty = true): string
string
Exports feed to JSON format. Suitable for APIs.
Example
// Export to JSON
$json = FeedExporter::toJSON($feed, true); // Pretty print
// For API response
header('Content-Type: application/json');
echo $json;
// Or without pretty print (smaller size)
$json = FeedExporter::toJSON($feed, false);
// JSON structure
{
"url": "https://example.com/feed.xml",
"title": "Example Feed",
"items_count": 10,
"items": [
{
"id": "item-123",
"title": "Example Article",
"link": "https://example.com/article",
"content": "Article content...",
"enclosure": null,
"publishedAt": "2024-01-15T10:30:00+00:00",
"category": "Technology",
"extra": {},
"fingerprint": "sha256hash..."
}
]
}
static
public
toRSS()
toRSS(Feed $feed): string
string
Exports feed to RSS 2.0 format.
Example
// Export to RSS
$rss = FeedExporter::toRSS($feed);
// For RSS feed endpoint
header('Content-Type: application/rss+xml');
echo $rss;
// Save to file
file_put_contents('export.rss', $rss);
// RSS output includes:
// - Channel information
// - All items with title, link, description, pubDate, guid
// - Categories and enclosures if present
// - Atom namespace for compatibility
static
public
toAtom()
toAtom(Feed $feed): string
string
Exports feed to Atom format.
static
public
toJSONFeed()
toJSONFeed(Feed $feed): string
string
Exports feed to JSON Feed format (https://jsonfeed.org).
static
public
toCSV()
toCSV(Feed $feed, bool $withHeader = true): string
string
Exports feed to CSV format for Excel or spreadsheet import.
Example
// Export to CSV
$csv = FeedExporter::toCSV($feed, true); // With header row
// For download
header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename="feed-export.csv"');
echo $csv;
// CSV format:
// ID,Title,Link,Category,Published Date,Has Content,Has Enclosure
// "item-123","Example Article","https://example.com/article","Technology","2024-01-15 10:30:00","Yes","No"
// ...
static
public
toHTML()
toHTML(Feed $feed): string
string
Exports feed to HTML format for web display.
Example
// Export to HTML
$html = FeedExporter::toHTML($feed);
// Output directly
echo $html;
// Or save to file
file_put_contents('feed-export.html', $html);
// HTML includes:
// - Full HTML document with CSS styles
// - Feed header with title and URL
// - All items in styled divs
// - Responsive design
// - Clean, readable presentation
static
public
toText()
toText(Feed $feed): string
string
Exports feed to plain text format.
Example
// Export to plain text
$text = FeedExporter::toText($feed);
// For email or plain text display
echo $text;
// Text format:
// Feed: Example Feed
// URL: https://example.com/feed.xml
// Items: 10
// ================================================
//
// Title: Example Article
// Link: https://example.com/article
// Published: 2024-01-15 10:30:00
// Category: Technology
// Content: Article content excerpt...
// -------------------------------------------------
static
private
escapeCsvField()
escapeCsvField(string $field): string
Escapes CSV field properly (private helper method).
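A common way to escape CSV fields is to wrap every field in double quotes and double any quotes inside it, which matches the fully quoted sample row shown for toCSV(). The sketch below uses hypothetical function names and is an assumption about the approach, not a copy of the private helper.

```php
<?php
// Hypothetical sketch of CSV field escaping; the private escapeCsvField() may differ.
function quoteCsvField(string $field): string
{
    // Double embedded quotes, then wrap the whole field in quotes.
    return '"' . str_replace('"', '""', $field) . '"';
}

function buildCsvRow(array $fields): string
{
    return implode(',', array_map('quoteCsvField', $fields));
}

echo buildCsvRow(['item-123', 'He said "hi"', 'https://example.com/article']), "\n";
```

Quoting every field keeps commas and line breaks inside titles from splitting a row.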
- JSON: For APIs and JavaScript applications
- RSS/Atom: For feed readers and syndication
- CSV: For Excel analysis and data import
- HTML: For web display and embedding
- Text: For email notifications and logs
- JSON Feed: For modern web applications
PollingTransport
Simple HTTP transport for fetching feed content
- final class PollingTransport
The PollingTransport class provides a simple HTTP transport using file_get_contents() with stream context. It's a lightweight alternative to FeedFetcher.
use Hosseinhunta\Huntfeed\Transport\PollingTransport;
// Create polling transport
$transport = new PollingTransport();
// Fetch feed content
try {
$xml = $transport->fetch('https://example.com/feed.xml');
echo "Fetched " . strlen($xml) . " bytes\n";
} catch (RuntimeException $e) {
echo "Fetch failed: " . $e->getMessage() . "\n";
}
Key Methods
public
fetch()
fetch(string $url): string
string
Fetches content from URL using file_get_contents() with stream context.
Parameters
| Parameter | Type | Description |
|---|---|---|
| $url | string | URL to fetch |
Configuration
The transport uses the following stream context options:
- timeout: 10 seconds
- user_agent: 'Huntfeed/1.0'
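For readers unfamiliar with stream contexts, the sketch below shows how the two options listed above are typically passed to file_get_contents(). `buildFeedContext()` and `fetchUrl()` are hypothetical names; the actual PollingTransport source may be organized differently.

```php
<?php
// Hypothetical sketch of a file_get_contents()-based fetch; not the PollingTransport source.

/** Build a stream context with the documented timeout and user agent. */
function buildFeedContext(int $timeoutSeconds = 10, string $userAgent = 'Huntfeed/1.0')
{
    return stream_context_create([
        'http' => [
            'timeout'    => $timeoutSeconds,
            'user_agent' => $userAgent,
        ],
    ]);
}

function fetchUrl(string $url): string
{
    // @ suppresses the PHP warning; failure is reported through the exception instead.
    $content = @file_get_contents($url, false, buildFeedContext());
    if ($content === false) {
        throw new RuntimeException("Failed to fetch {$url}");
    }
    return $content;
}

$options = stream_context_get_options(buildFeedContext());
echo "Timeout: {$options['http']['timeout']}s, User-Agent: {$options['http']['user_agent']}\n";
```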
Example
// Simple fetch example
$transport = new PollingTransport();
$urls = [
'https://example.com/feed1.xml',
'https://example.com/feed2.rss',
'https://example.com/feed3.json'
];
foreach ($urls as $url) {
try {
$content = $transport->fetch($url);
// Parse with an appropriate parser. Here $parser is assumed to be any
// ParserInterface implementation created beforehand (for example GeoRssParser).
if ($parser->supports($content)) {
$feed = $parser->parse($content, $url);
echo "Fetched: {$feed->title} ({$feed->itemsCount()} items)\n";
}
} catch (RuntimeException $e) {
echo "Failed to fetch {$url}: " . $e->getMessage() . "\n";
}
}
Exceptions
| Exception | Condition |
|---|---|
| RuntimeException | If file_get_contents() returns false |
- PollingTransport: Simple, lightweight, no cURL dependency
- FeedFetcher: Full-featured, SSL verification, redirects, custom headers, timeout control
- Use PollingTransport for simple scripts or when cURL is not available
- Use FeedFetcher for production applications with robust requirements
Event System
Event-driven architecture for real-time notifications
HuntFeed uses an event-driven architecture to notify your application of important events. Events are triggered automatically and can be handled with callbacks registered via FeedManager::on().
// Register event handlers
$manager->on('feed:registered', function($data) {
echo "New feed registered: {$data['feedId']}\n";
});
$manager->on('item:new', function($data) {
$item = $data['item'];
echo "New item: {$item->title}\n";
});
Available Events
- feed:registered - Triggered when a new feed is registered
- feed:updated - Triggered when a feed is updated with new items
- item:new - Triggered when a new item is detected
- Triggered when a feed is removed
- Triggered when a feed is force-updated
- Triggered when a duplicate item is detected (internal)
WebSub-Specific Events
- Successful subscription to hub
- Successful unsubscription
- websub:notification - Incoming push notification
- websub:verification - Hub verification challenge
- websub:error - WebSub-related error
Event Data Structure
// Example event data structure
$eventData = [
'feedId' => 'hacker_news', // Feed identifier
'feed' => Feed, // Feed object (if applicable)
'item' => FeedItem, // Item object (if applicable)
'url' => string, // Feed URL
'new_items_count' => int, // Number of new items (for feed:updated)
'timestamp' => DateTimeImmutable, // Event timestamp
'metadata' => [/* additional data */]
];
// WebSub events have additional data
$websubEventData = [
'feed_url' => string,
'hub_url' => string,
'callback_url' => string,
'verified' => bool,
'lease_seconds' => int,
'subscription_id' => string,
'items' => array, // For websub:notification
'challenge' => string, // For websub:verification
'error' => string // For websub:error
];
Registering Event Handlers
// Comprehensive event handling example
// $pdo is an existing PDO connection; it must be imported into the closure with use ()
$manager->on('feed:registered', function($data) use ($pdo) {
$feedId = $data['feedId'];
$url = $data['url'];
// Log to database
$pdo->prepare('INSERT INTO feed_log (feed_id, event, timestamp) VALUES (?, ?, ?)')
->execute([$feedId, 'registered', date('Y-m-d H:i:s')]);
// Send notification
sendSlackMessage("📥 New feed registered: {$feedId}\nURL: {$url}");
});
$manager->on('item:new', function($data) {
$item = $data['item'];
$feedId = $data['feedId'];
// Multiple actions for new items
$actions = [
'database' => function() use ($item) {
// Save to database
saveToDatabase($item);
},
'notification' => function() use ($item, $feedId) {
// Send notifications
sendEmailNotification($item);
sendPushNotification($item);
sendSlackMessage("📰 New item in {$feedId}: {$item->title}");
},
'analytics' => function() use ($feedId) {
// Update analytics
incrementCounter("feed_{$feedId}_new_items");
},
'cache' => function() use ($feedId) {
// Clear relevant caches
clearFeedCache($feedId);
}
];
// Execute all actions
foreach ($actions as $action) {
try {
$action();
} catch (Exception $e) {
error_log("Action failed: " . $e->getMessage());
}
}
});
$manager->on('feed:updated', function($data) {
$feedId = $data['feedId'];
$count = $data['new_items_count'];
// Update statistics
updateFeedStatistics($feedId, $count);
// Log update
file_put_contents('updates.log',
date('Y-m-d H:i:s') . " - {$feedId} - {$count} new items\n",
FILE_APPEND
);
// Trigger additional processing if many new items
if ($count > 10) {
triggerBatchProcessing($feedId);
}
});
// WebSub event handling
$webSubManager->getHandler()->onNotification(function($notification) {
// Process WebSub notification
$items = $notification['items'];
foreach ($items as $item) {
// Mark as from WebSub
$item['source'] = 'websub';
// Process
processWebSubItem($item);
}
// Update WebSub statistics
incrementCounter('websub_notifications_received');
updateLastWebSubActivity();
});
Multiple Handlers
You can register multiple handlers for the same event:
// Multiple handlers for the same event
$manager->on('item:new', function($data) {
// Handler 1: Database
saveToDatabase($data['item']);
});
$manager->on('item:new', function($data) {
// Handler 2: Notifications
sendNotifications($data['item']);
});
$manager->on('item:new', function($data) {
// Handler 3: Analytics
updateAnalytics($data['feedId']);
});
$manager->on('item:new', function($data) {
// Handler 4: Logging
logNewItem($data['item']);
});
// All handlers will be called when 'item:new' event is triggered
// Order of execution is the same as registration order
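The registration-order behavior can be illustrated with a very small event bus. `TinyEventBus` is a hypothetical class written for this example and says nothing about FeedManager's internals. In particular, catching exceptions inside `emit()` is a design choice of the sketch, so handlers registered with the real manager should still guard themselves.

```php
<?php
// Hypothetical minimal event bus; illustrates ordering only, not HuntFeed's implementation.
final class TinyEventBus
{
    /** @var array<string, callable[]> handlers grouped by event name */
    private array $handlers = [];

    public function on(string $event, callable $handler): self
    {
        $this->handlers[$event][] = $handler; // appended, so registration order is preserved
        return $this;
    }

    /** Call every handler in order and return the messages of any that threw. */
    public function emit(string $event, array $data): array
    {
        $errors = [];
        foreach ($this->handlers[$event] ?? [] as $handler) {
            try {
                $handler($data);
            } catch (Throwable $e) {
                $errors[] = $e->getMessage(); // one failing handler does not stop the rest
            }
        }
        return $errors;
    }
}

$bus = new TinyEventBus();
$bus->on('item:new', function (array $data) { echo "1: save {$data['feedId']}\n"; })
    ->on('item:new', function (array $data) { echo "2: notify {$data['feedId']}\n"; });

$bus->emit('item:new', ['feedId' => 'hacker_news']);
```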
Error Handling in Event Handlers
// Robust event handling with error management
$manager->on('item:new', function($data) {
try {
// Main processing
$item = $data['item'];
$feedId = $data['feedId'];
// Complex processing with error handling
$success = processItemWithRetry($item, 3); // 3 retries
if (!$success) {
throw new Exception("Failed to process item after retries");
}
// Log success
logSuccess($feedId, $item->id);
} catch (Exception $e) {
// Log error but don't break other handlers
error_log("Event handler error: " . $e->getMessage());
// Optionally notify admin
sendAdminAlert("Event handler failed for {$data['feedId']}", $e);
// Continue - don't rethrow
}
});
// Separate error handling events
$manager->on('error', function($error) {
// Centralized error handling
logError($error);
notifyAdmin($error);
incrementErrorCounter($error['type']);
});
- Keep event handlers focused on single responsibilities
- Implement proper error handling in each handler
- Use try-catch blocks to prevent one handler from breaking others
- Consider performance implications of many handlers
- Log all important events for debugging and auditing
- Use events for decoupled, maintainable code architecture