Ruqqus public dataset

A subverse to post things for posterity.

Moderators: MadWorld, kestrel9

SearchVoat
Posts: 440
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 298
Reply points (CCP): 795

Re: Ruqqus public dataset

Post by SearchVoat »

antiliberalsociety wrote: Sat Nov 06, 2021 1:12 am If you could get the comments to go with the posts...
Happy to do that if someone can get hold of the data. According to @MadWorld above it looks like it's gone.
MadWorld
Posts: 1229
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 1276
Reply points (CCP): 2987

Re: Ruqqus public dataset

Post by MadWorld »

SearchVoat wrote: Sun Nov 07, 2021 6:22 am
antiliberalsociety wrote: Sat Nov 06, 2021 1:12 am If you could get the comments to go with the posts...
Happy to do that if someone can get hold of the data. According to @MadWorld above it looks like it's gone.
We got lucky. Whatever was available via the API got archived just in time. :lol: That means comments and i.ruqqus.com data, too. I have not gotten around to uploading it yet.

Edit: if you could present it the same way you had done with searchvoat.co/v/[subverse]/[submission], that would be fantastic!! It would make the page view seamless, with respect to ruqqus.com.
Last edited by MadWorld on Sun Nov 07, 2021 7:28 am, edited 1 time in total.
SearchVoat
Posts: 440
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 298
Reply points (CCP): 795

Re: Ruqqus public dataset

Post by SearchVoat »

MadWorld wrote: Sun Nov 07, 2021 7:26 am if you could present it the same way you had done with searchvoat.co/v/[subverse]/[submission], that would be fantastic!! It would make the page view seamless, with respect to ruqqus.com.
I'll do that after we have the comments.

btw I have the thumbnails from i.ruqqus.com/posts/[item_id_alias]/thumb.png - is there other i.ruqqus.com data?
antiliberalsociety
Posts: 2633
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 3394
Reply points (CCP): 4462

Re: Ruqqus public dataset

Post by antiliberalsociety »

MadWorld wrote: Sun Nov 07, 2021 7:26 am
SearchVoat wrote: Sun Nov 07, 2021 6:22 am
antiliberalsociety wrote: Sat Nov 06, 2021 1:12 am If you could get the comments to go with the posts...
Happy to do that if someone can get hold of the data. According to @MadWorld above it looks like it's gone.
We got lucky. Whatever was available via the API got archived just in time. :lol: That means comments and i.ruqqus.com data, too. I have not gotten around to uploading it yet.

Edit: if you could present it the same way you had done with searchvoat.co/v/[subverse]/[submission], that would be fantastic!! It would make the page view seamless, with respect to ruqqus.com.
Image
MadWorld
Posts: 1229
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 1276
Reply points (CCP): 2987

Re: Ruqqus public dataset

Post by MadWorld »

SearchVoat wrote: Sun Nov 07, 2021 8:28 am
MadWorld wrote: Sun Nov 07, 2021 7:26 am if you could present it the same way you had done with searchvoat.co/v/[subverse]/[submission], that would be fantastic!! It would make the page view seamless, with respect to ruqqus.com.
I'll do that after we have the comments.

btw I have the thumbnails from i.ruqqus.com/posts/[item_id_alias]/thumb.png - is there other i.ruqqus.com data?
I will upload the raw data from the API and send you the link. The submissions are already quite readable, nicely done!! The i.ruqqus.com data is 100GB+ in size. It consists mainly of the thumbnails, the full views, and other images linked in either submissions or comments. If ruqqus shuts down this subdomain, we will still have the data.
SearchVoat
Posts: 440
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 298
Reply points (CCP): 795

Re: Ruqqus public dataset

Post by SearchVoat »

MadWorld wrote: Sun Nov 07, 2021 8:03 pm The i.ruqqus.com data is 100GB+ in size.
I don’t have enough storage for that. I shrunk the thumbnails to make them fit. Consider a 3rd party image host. Can accommodate the comments though (probably).
Last edited by SearchVoat on Sun Nov 07, 2021 8:17 pm, edited 1 time in total.
MadWorld
Posts: 1229
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 1276
Reply points (CCP): 2987

Re: Ruqqus public dataset

Post by MadWorld »

SearchVoat wrote: Sun Nov 07, 2021 8:16 pm
MadWorld wrote: Sun Nov 07, 2021 8:03 pm The i.ruqqus.com data is 100GB+ in size.
I don’t have enough storage for that. I shrunk the thumbnails to make them fit. Consider a 3rd party image host. Can accommodate the comments though (probably).
The i.ruqqus.com data is optional, since you already have the thumbnails in place. The comments are around 6.5GB in text format without compression, but the actual size of the essential info should be a bit smaller.
antiliberalsociety
Posts: 2633
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 3394
Reply points (CCP): 4462

Re: Ruqqus public dataset

Post by antiliberalsociety »

MadWorld wrote: Sun Nov 07, 2021 8:30 pm
SearchVoat wrote: Sun Nov 07, 2021 8:16 pm
MadWorld wrote: Sun Nov 07, 2021 8:03 pm The i.ruqqus.com data is 100GB+ in size.
I don’t have enough storage for that. I shrunk the thumbnails to make them fit. Consider a 3rd party image host. Can accommodate the comments though (probably).
The i.ruqqus.com data is optional, since you already have the thumbnails in place. The comments are around 6.5GB in text format without compression, but the actual size of the essential info should be a bit smaller.
This is why when they implemented site hosted images, I stuck with catbox. It was offered as "premium" to users who earned reddit, I mean Ruqqus Gold™. To which I would say thanks, but I don't support this censorious site. All too many people took the bait. I knew they'd shut down one day & all those images would go with it.

At least some were smart enough to save it 😂
MadWorld
Posts: 1229
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 1276
Reply points (CCP): 2987

Re: Ruqqus public dataset

Post by MadWorld »

file: comments.fx.7z 6.2 GB compressed to 275 MB.

Comments with duplicates removed, sorted by base36 comment id. The API data included quite a bit of info, such as user stats and guild settings. You could create a template out of ruqqus's static page and plug in the info available. :lol: It would be hilarious to see a near-identical page view on SearchVoat.
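
For anyone loading the file, here is a minimal Python sketch of what "duplicates removed, sorted by base36 comment id" could look like. The field names `id` and `body` and the sample records are assumptions for illustration, not the actual schema of the dump. The point is that base36 ids have to be compared as numbers, since a plain string sort would put "10" before "a".

```python
def base36_to_int(comment_id: str) -> int:
    """Convert a base36 id such as 'z' or '10' to its integer value."""
    return int(comment_id, 36)


def dedupe_and_sort(comments):
    """Keep the first record seen for each id, then sort numerically by id.

    `comments` is assumed to be an iterable of dicts with an 'id' key
    (hypothetical field name).
    """
    unique = {}
    for comment in comments:
        unique.setdefault(comment["id"], comment)
    return sorted(unique.values(), key=lambda c: base36_to_int(c["id"]))


# Made-up sample records, including one duplicate.
sample = [
    {"id": "10", "body": "third"},
    {"id": "z", "body": "second"},
    {"id": "a", "body": "first"},
    {"id": "z", "body": "second"},
]

ordered = dedupe_and_sort(sample)
print([c["id"] for c in ordered])
```

Python's built-in `int(text, 36)` accepts digits 0-9 and letters a-z in either case, so no custom decoder is needed.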

You could even use "searchvoat.co/ruqqus/[original url without domain name]" to view SearchVoat's version of data.
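
A rough sketch of that URL mapping in Python, assuming the input is a full URL including the scheme and that the mirror is served over https (both are assumptions). The function name `to_searchvoat_url` and the example path are hypothetical.

```python
from urllib.parse import urlsplit


def to_searchvoat_url(original_url: str) -> str:
    """Map an original URL to searchvoat.co/ruqqus/[url without domain name].

    Assumes `original_url` includes a scheme (e.g. https://); without one,
    urlsplit treats the domain as part of the path.
    """
    parts = urlsplit(original_url)
    path = parts.path.lstrip("/")
    mirrored = f"https://searchvoat.co/ruqqus/{path}"
    if parts.query:
        mirrored += "?" + parts.query
    return mirrored


# Placeholder path, not a real ruqqus URL layout.
print(to_searchvoat_url("https://ruqqus.com/some/path"))
```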

Thank you, @SearchVoat!! We love you!! (also no homo :lol: )

Edit: the id of the parent comment is also available for comments with a level value greater than 1.
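
With the parent id available, the reply threads can be rebuilt. Below is a minimal Python sketch, assuming each comment record has `id`, `level`, and `parent_id` fields; those names and the sample data are assumptions, so adjust them to whatever the dump actually uses.

```python
from collections import defaultdict


def build_threads(comments):
    """Split comments into top-level ids and a parent -> children mapping.

    Comments with level 1 (or no parent id) are treated as top-level.
    """
    children = defaultdict(list)
    roots = []
    for comment in comments:
        if comment["level"] > 1 and comment.get("parent_id"):
            children[comment["parent_id"]].append(comment["id"])
        else:
            roots.append(comment["id"])
    return roots, dict(children)


def walk(ids, children, depth=0):
    """Yield (depth, comment_id) pairs in display order, depth-first."""
    for comment_id in ids:
        yield depth, comment_id
        yield from walk(children.get(comment_id, []), children, depth + 1)


# Made-up sample records.
sample = [
    {"id": "a", "level": 1, "parent_id": None},
    {"id": "b", "level": 2, "parent_id": "a"},
    {"id": "c", "level": 3, "parent_id": "b"},
    {"id": "d", "level": 1, "parent_id": None},
]

roots, children = build_threads(sample)
for depth, comment_id in walk(roots, children):
    print("  " * depth + comment_id)
```

The recursive walk is fine for ordinary thread depths; a very deep reply chain would need an explicit stack instead.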
Last edited by MadWorld on Mon Nov 08, 2021 9:47 pm, edited 1 time in total.
antiliberalsociety
Posts: 2633
Joined: Wed Dec 23, 2020 2:00 am
Topic points (SCP): 3394
Reply points (CCP): 4462

Re: Ruqqus public dataset

Post by antiliberalsociety »

MadWorld wrote: Mon Nov 08, 2021 9:29 pm file: comments.fx.7z 6.2 GB compressed to 275 MB.

Comments with duplicates removed, sorted by base36 comment id. The API data included quite a bit of info, such as user stats and guild settings. You could create a template out of ruqqus's static page and plug in the info available. :lol: It would be hilarious to see a near-identical page view on SearchVoat.

You could even use "searchvoat.co/ruqqus/[original url without domain name]" to view SearchVoat's version of data.

Thank you, @SearchVoat!! We love you!! (also no homo :lol: )

Edit: the id of the parent comment is also available for comments with a level value greater than 1.
As Puttitout once told me...

God Bless you goat!