Ruqqus public dataset
Moderators: MadWorld, kestrel9
- SearchVoat
- Posts: 440
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 298
- Reply points (CCP): 795
Re: Ruqqus public dataset
antiliberalsociety wrote: ↑Sat Nov 06, 2021 1:12 am
If you could get the comments to go with the posts...

Happy to do that if someone can get hold of the data. According to @MadWorld above it looks like it's gone.
- MadWorld
- Posts: 1229
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 1276
- Reply points (CCP): 2987
Re: Ruqqus public dataset
SearchVoat wrote: ↑Sun Nov 07, 2021 6:22 am
Happy to do that if someone can get hold of the data. According to @MadWorld above it looks like it's gone.

We got lucky. Whatever was available via the API got archived just in time. That means comments and i.ruqqus.com data, too. I have not gotten around to uploading it yet.
Edit: if you could present it the same way you had done with searchvoat.co/v/[subverse]/[submission], that would be fantastic!! It would make the page view seamless, with respect to ruqqus.com.
Last edited by MadWorld on Sun Nov 07, 2021 7:28 am, edited 1 time in total.
- SearchVoat
Re: Ruqqus public dataset
I'll do that after we have the comments.
btw I have the thumbnails from i.ruqqus.com/posts/[item_id_alias]/thumb.png - is there other i.ruqqus.com data?
- antiliberalsociety
- Posts: 2633
- Joined: Wed Dec 23, 2020 2:00 am
- Topic points (SCP): 3394
- Reply points (CCP): 4462
Re: Ruqqus public dataset
MadWorld wrote: ↑Sun Nov 07, 2021 7:26 am
We got lucky. Whatever was available via the API got archived just in time. That means comments and i.ruqqus.com data, too. I have not gotten around to uploading it yet.
Edit: if you could present it the same way you had done with searchvoat.co/v/[subverse]/[submission], that would be fantastic!! It would make the page view seamless, with respect to ruqqus.com.
- MadWorld
Re: Ruqqus public dataset
SearchVoat wrote: ↑Sun Nov 07, 2021 8:28 am
I'll do that after we have the comments.
btw I have the thumbnails from i.ruqqus.com/posts/[item_id_alias]/thumb.png - is there other i.ruqqus.com data?

I will upload the raw data from the API and send you the link. The submissions are already quite readable, nicely done!! The i.ruqqus.com data is 100GB+ in size. It consists mainly of the thumbnails, the full view, and other images linked in either submissions or comments. If ruqqus shuts down this subdomain, we will still have the data.
- SearchVoat
Re: Ruqqus public dataset
I don’t have enough storage for that. I shrunk the thumbnails to make them fit. Consider a 3rd party image host. Can accommodate the comments though (probably).
Last edited by SearchVoat on Sun Nov 07, 2021 8:17 pm, edited 1 time in total.
- MadWorld
Re: Ruqqus public dataset
SearchVoat wrote: ↑Sun Nov 07, 2021 8:16 pm
I don’t have enough storage for that. I shrunk the thumbnails to make them fit. Consider a 3rd party image host. Can accommodate the comments though (probably).

The i.ruqqus.com data is optional, since you already have the thumbnails in place. The comments are around 6.5GB in text format without compression. But the actual size of the essential info should be a bit smaller.
- antiliberalsociety
Re: Ruqqus public dataset
MadWorld wrote: ↑Sun Nov 07, 2021 8:30 pm
The i.ruqqus.com data is optional, since you already have the thumbnails in place. The comments are around 6.5GB in text format without compression. But the actual size of the essential info should be a bit smaller.

This is why when they implemented site hosted images, I stuck with catbox. It was offered as "premium" to users who earned reddit, I mean Ruqqus Gold™. To which I would say thanks, but don't support this censorious site. All too many people took the bait. I knew they'd shut down one day & all those images would go with it.
At least some were smart enough to save it.
- MadWorld
Re: Ruqqus public dataset
file: comments.fx.7z 6.2 GB compressed to 275 MB.
Comments with duplicates removed, sorted by base36 comment id. The API data included quite a bit of info, such as user stats and guild settings. You could create a template out of ruqqus's static page and plug in the info available. It would be hilarious to see a near-identical page view on the SearchVoat page.
You could even use "searchvoat.co/ruqqus/[original url without domain name]" to view SearchVoat's version of data.
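A minimal sketch of that URL mapping, assuming the mirror is served over https and that only the domain part is swapped out. The helper name `to_mirror_url` and the example path in the usage note are hypothetical, not something SearchVoat has implemented:

```python
from urllib.parse import urlsplit

# Assumption: the mirror lives under this prefix and is served over https.
MIRROR_PREFIX = "https://searchvoat.co/ruqqus"


def to_mirror_url(original_url: str) -> str:
    """Rewrite a ruqqus.com URL into searchvoat.co/ruqqus/[url without domain]."""
    # urlsplit only recognises the host when a scheme or a leading "//" is
    # present, so add "//" for bare inputs such as "ruqqus.com/...".
    if "://" not in original_url:
        original_url = "//" + original_url
    parts = urlsplit(original_url)
    mirrored = f"{MIRROR_PREFIX}/{parts.path.lstrip('/')}"
    if parts.query:
        mirrored += "?" + parts.query
    return mirrored
```

For example, `to_mirror_url("https://ruqqus.com/post/abc1/example")` would give `https://searchvoat.co/ruqqus/post/abc1/example` (made-up path, for illustration only).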
Thank you, @SearchVoat!! We love you!! (also no homo)
Edit: id of parent comment is also available for level value greater than 1.
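For anyone loading the file, here is a rough sketch of the dedupe, base36 sort, and parent lookup described above. The field names `id`, `level`, and `parent_id` are assumptions for illustration; the actual keys in the archive may differ:

```python
from collections import defaultdict


def dedupe_and_sort(comments):
    """Drop repeated ids (first occurrence wins), then sort by base36 id."""
    unique = {}
    for comment in comments:
        unique.setdefault(comment["id"], comment)
    # int(s, 36) parses base36, so "z" (35) sorts before "10" (36);
    # a plain string sort would get that order wrong.
    return sorted(unique.values(), key=lambda c: int(c["id"], 36))


def group_by_parent(comments):
    """Map parent comment id -> replies; top-level comments go under None."""
    children = defaultdict(list)
    for comment in comments:
        # Only comments with level > 1 are expected to carry a parent id.
        parent = comment.get("parent_id") if comment.get("level", 1) > 1 else None
        children[parent].append(comment)
    return children
```

Walking `group_by_parent(...)` from the `None` key downward would then reproduce the nested thread view for a template.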
Last edited by MadWorld on Mon Nov 08, 2021 9:47 pm, edited 1 time in total.
- antiliberalsociety
Re: Ruqqus public dataset
MadWorld wrote: ↑Mon Nov 08, 2021 9:29 pm
file: comments.fx.7z 6.2 GB compressed to 275 MB.
Comments with duplicates removed, sorted by base36 comment id. The API data included quite a bit of info, such as user stats and guild settings. You could create a template out of ruqqus's static page and plug in the info available. It would be hilarious to see a near-identical page view on the SearchVoat page.
You could even use "searchvoat.co/ruqqus/[original url without domain name]" to view SearchVoat's version of data.
Thank you, @SearchVoat!! We love you!! (also no homo)
Edit: id of parent comment is also available for level value greater than 1.

As Puttitout once told me...
God Bless you goat!