mysql -u********** -p********** -e "SELECT tags FROM seo.xvideos WHERE MATCH (tags) against ('+Big +(Penis dick cock)' in boolean mode);" > /path/big-penis-tags-less~gay.csv
mysql -u********** -p********** -e "SELECT tags FROM seo.xvideos WHERE MATCH (tags) against (' +Cougar +MILF' in boolean mode);" >/path/tag-study/cougar-tags.csv
mysql -u********** -p********** -e "SELECT tags FROM seo.xvideos WHERE tags LIKE '%Buttplug%';" >/path/tag-study/buttplug-tags.csv
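Once the tags are exported, a rough tally of how often each tag shows up can be had with standard shell tools. This is only a sketch: `count_tags` is a made-up helper name, and it assumes the tags on each row are separated by commas, which may not match the real data. Also note that `mysql -e` output starts with a `tags` header line, which gets counted as a tag unless the export is run with `-N` (`--skip-column-names`).

```shell
# Tally tag frequency in an exported file, most common first.
# ASSUMPTION: one row of tags per line, tags separated by commas.
# Change the ',' below if the real data uses another delimiter.
count_tags() {
  tr ',' '\n' < "$1" |                                 # one tag per line
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |       # trim surrounding whitespace
    grep -v '^$' |                                     # drop empty entries
    tr '[:upper:]' '[:lower:]' |                       # fold case: "Cougar" == "cougar"
    sort | uniq -c | sort -rn                          # count, then sort by count
}
```

Something like `count_tags /path/tag-study/cougar-tags.csv | head -20` would then list the twenty most frequent tags in that export.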
xvideos distributes a database of all their current data (or at least they did; I'm not sure they still do). The table has over 4.5 million rows, and these or similar queries are rather slow: sometimes over 15 seconds here, depending on the query's complexity.
Code:
mysql> SHOW CREATE TABLE xvideos;
| xvideos | CREATE TABLE `xvideos` (
`url` char(50) NOT NULL,
`performer` char(30) NOT NULL,
`runtime` char(12) DEFAULT NULL,
`thumb` char(120) DEFAULT NULL,
`iframeCode` char(200) DEFAULT NULL,
`tags` char(200) DEFAULT NULL,
`category` char(60) DEFAULT NULL,
KEY `category` (`category`(10)),
FULLTEXT KEY `idx_1` (`category`),
FULLTEXT KEY `idx_2` (`tags`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
1 row in set (0.00 sec)
mysql> SELECT COUNT(*) FROM xvideos;
+----------+
| COUNT(*) |
+----------+
| 4543335 |
+----------+
1 row in set (0.00 sec)
mysql> SELECT tags FROM seo.xvideos WHERE tags LIKE '%girl%';
602301 rows in set (10.98 sec)
mysql> SELECT tags FROM seo.xvideos WHERE category LIKE '%girl%';
169530 rows in set (1.68 sec)
mysql> SELECT tags FROM seo.xvideos WHERE MATCH (tags) against (' +Cougar +MILF' in boolean mode);
62322 rows in set (12.35 sec)
But this is of limited value: it only filters on the words each video's uploader chose as a description, which makes the results very subjective. I don't think designing and developing an AI, or even a 'dumb-bot', schema and program to describe tube videos for traffic blogs and other sites would be worth the expense. There is not that much money in it, but it could probably be done.