Ruqqus public dataset
Moderators: MadWorld, kestrel9
- MadWorld
- Posts: 1229
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 1276
- Reply points (CCP): 2987
Ruqqus public dataset
Public notes
For submissions, the first line of the markdown plaintext (item_hpt) has the form '[title](url)' and is the title of the submission. The rest is the body of the submission.
The attribute 'item_id_alias' is specific to each content type (submission, mod list, mod log, etc.) and is not unique across types.
The modlog returned by the ruqqus API has 31396 entries. The modlog on record has 71174 entries. The official modlog is therefore missing 39778 entries (56%).
This release consists mainly of plaintext in markdown format. It includes the dataset from the pre-purge period, so whatever the admins purged should mostly be there (I hope). Ruqqus cut the cord far earlier than promised, making its site unusable. Furthermore, it appears to have purged more than 50% of the modlog records, which makes the latest dataset questionable. As a result, additional info will not be included. It is moot at this point.
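For anyone loading the submissions, the first-line convention from the notes above can be handled roughly like this. It is only a sketch: the function name, the regex, and the fallback for a first line that is not a markdown link are my assumptions, not part of the release. URLs that contain parentheses are not matched by this pattern and go through the fallback.

```python
import re
from typing import NamedTuple, Optional

# Matches a whole line of the form [title](url).
# Assumption: the url contains no parentheses or whitespace.
_TITLE_LINE = re.compile(r"^\[(?P<title>.*)\]\((?P<url>[^()\s]*)\)\s*$")


class ParsedSubmission(NamedTuple):
    title: str
    url: Optional[str]
    body: str


def split_item_hpt(item_hpt: str) -> ParsedSubmission:
    """Split a submission's item_hpt into title, url and body."""
    # Everything before the first newline is the title line, the rest is the body.
    first_line, _, body = item_hpt.partition("\n")
    match = _TITLE_LINE.match(first_line.strip())
    if match is None:
        # Assumption: if the first line is not a markdown link,
        # keep it as the title and report no url.
        return ParsedSubmission(first_line.strip(), None, body)
    return ParsedSubmission(match.group("title"), match.group("url") or None, body)
```

Calling `split_item_hpt(record["item_hpt"])` on a submission record would give the three parts as a named tuple.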
Record count
submissions: 550271
mods: 8367
modlogs: 71174 (31396 on official record)
users: 12363 (excluding commenters)
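The modlog gap can be checked from the two counts above. The variable names are just for illustration.

```python
on_record = 71174  # modlog entries on record
via_api = 31396    # modlog entries on the official record (API)

missing = on_record - via_api
share = missing / on_record

print(f"missing: {missing} ({share:.0%})")  # missing: 39778 (56%)
```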
Attributes
submissions: item_id_alias, item_url, author, pub_date, guild, upvote, downvote, score, comment_count, item_hpt_ver, item_hpt
mods: item_id_alias, guild, author, pub_date, mod_type, permissions, item_hpt_ver, item_hpt
modlogs: item_id_alias, item_url, author, pub_date_est, guild, item_hpt_ver, item_hpt
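Since 'item_id_alias' is not unique across content types, one way to combine the three files is to key each record by the pair (type, item_id_alias). The sketch below assumes each record is already loaded as a dict keyed by the attribute names listed above; the layout of the JSON files is not described in this post, so the loading step is left out. The helper names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Attribute names per content type, copied from the lists above.
ATTRIBUTES: Dict[str, List[str]] = {
    "submissions": [
        "item_id_alias", "item_url", "author", "pub_date", "guild", "upvote",
        "downvote", "score", "comment_count", "item_hpt_ver", "item_hpt",
    ],
    "mods": [
        "item_id_alias", "guild", "author", "pub_date", "mod_type",
        "permissions", "item_hpt_ver", "item_hpt",
    ],
    "modlogs": [
        "item_id_alias", "item_url", "author", "pub_date_est", "guild",
        "item_hpt_ver", "item_hpt",
    ],
}

Record = Dict[str, Any]
Key = Tuple[str, str]


def missing_attributes(kind: str, record: Record) -> List[str]:
    """Return the expected attribute names that are absent from a record."""
    return [name for name in ATTRIBUTES[kind] if name not in record]


def index_records(kind: str, records: Iterable[Record]) -> Dict[Key, Record]:
    """Index records by (kind, item_id_alias) so ids from different types never collide."""
    index: Dict[Key, Record] = {}
    for record in records:
        index[(kind, str(record["item_id_alias"]))] = record
    return index
```

With this, a submission and a mod entry that share the same 'item_id_alias' stay as two separate entries when the indexes are merged into one dict.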
Files
https://archive.org/download/ruqqus-pub ... 10-14.json
https://archive.org/download/ruqqus-pub ... 10-14.json
https://archive.org/download/ruqqus-pub ... 10-14.json
@SearchVoat, I hope this could be made searchable on SVF. Thank you!!
Edit: it is possible that modlogs became hidden when guilds were set to private or banned. Deleting modlogs individually would not be efficient.
Last edited by MadWorld on Sat Oct 16, 2021 1:20 am, edited 1 time in total.
- antiliberalsociety
- Posts: 2633
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 3394
- Reply points (CCP): 4462
Re: Ruqqus public dataset
MadWorld wrote: ↑Fri Oct 15, 2021 9:57 pm Public notes [...]
Great work! They're trying to shoah their seedy past.
- MadWorld
- Posts: 1229
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 1276
- Reply points (CCP): 2987
Re: Ruqqus public dataset
antiliberalsociety wrote: ↑Sat Oct 16, 2021 1:18 am Great work! They're trying to shoah their seedy past.
Their site is barely functional at this point. You can still try to access its API at https://api.ruqqus.com, but after 3 to 5 requests it starts throwing errors. I think it only works because very few people know about that subdomain.
- antiliberalsociety
- Posts: 2633
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 3394
- Reply points (CCP): 4462
Re: Ruqqus public dataset
MadWorld wrote: ↑Sat Oct 16, 2021 1:23 am Their site is barely functional at this point. You could try to access its https://api.ruqqus.com. But as soon as 3 to 5 requests were made, it would start throwing errors. I think it only works, because very few people know that subdomain.
antiliberalsociety wrote: ↑Sat Oct 16, 2021 1:18 am Great work! They're trying to shoah their seedy past.
The irony
https://api.ruqqus.com/+Ruqqus/post/doe ... oing/15p4w
JoeMcCarthy · 3 days ago · Edited 3 days ago
They can always go to your site and get censorship. So there's that. But the question is: why would they want to?
You're also a suspicious character quite frankly. You spent God knows how much time at Discussions digging through months of Faust Alexander's posts with the specific intent of smearing him. You put the kind of effort into it that indicates you have an awful lot of time on your hands. Or you are paid to do this kind of stuff. Either way - not a good look. I mean, I never paid all that much attention to you. You get a lot of attention as it is - which you obviously enjoy. But that move got my attention. Because of what it indicates about you. And I don't recommend joining sites run by possible glowie types.
- MadWorld
- Posts: 1229
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 1276
- Reply points (CCP): 2987
Re: Ruqqus public dataset
Last update on stats of ruqqus.
Record on ruqqus's dataset via api
submission count: 500619 (at least 50K entries not shown via api)
comment count: 1636417
submission count by guild: https://files.catbox.moe/xi5qxt.txt
comment count by guild: https://files.catbox.moe/w3j6f7.txt
i.ruqqus.com: 573569 (100GB+)
Note on the API dataset: the submissions of at least one guild (+general) were not available via the API, but were visible on the web page. The API was supposedly in an unfiltered setting, yet some data remained hidden. As @antiliberalsociety has noticed, they appear to have already unplugged the api.ruqqus.com subdomain. Surprisingly, some entries from HitlerWasRight showed up on the API.
541 HitlerWasRight submissions.
4306 HitlerWasRight comments.
The only things that still work right now are the static home page and the media on the i.ruqqus.com subdomain. I can make the JSON data available, but the data on the i.ruqqus.com subdomain is probably not worth uploading.
Edit: fixed links.
Last edited by MadWorld on Sat Nov 06, 2021 4:14 am, edited 1 time in total.
- antiliberalsociety
- Posts: 2633
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 3394
- Reply points (CCP): 4462
Re: Ruqqus public dataset
MadWorld wrote: ↑Fri Nov 05, 2021 6:36 pm Last update on stats of ruqqus. [...]
They really created a monster
- SearchVoat
- Posts: 440
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 298
- Reply points (CCP): 795
- MadWorld
- Posts: 1229
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 1276
- Reply points (CCP): 2987
Re: Ruqqus public dataset
o7 Thank you for doing this!!!
- antiliberalsociety
- Posts: 2633
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 3394
- Reply points (CCP): 4462
Re: Ruqqus public dataset
I LOVE YOU!!
No homo
If you could get the comments to go with the posts...
Last edited by antiliberalsociety on Sat Nov 06, 2021 1:30 pm, edited 1 time in total.