Artifact metadata

Stored reference for a generated output.

Created: Apr 14, 4:29 PM
Run: Report Indexer index run
Reference: /artifacts/artifact_gemsearch_72a03240c1ab7bfd
Storage: addyco-artifacts/gemsearch/reports/2026-04-14_16-29-08_github-repos-that-allow-to-search-reddit-posts-and-comments.md
Hash: 72a03240c1ab7bfd910a9020f75f41dd4198162426b97c7805014bf04d23acf3
Model: gemini-3.1-pro-preview

Preview

Stored report content from Supabase Storage.

# Search: github repos that allow to search reddit posts and comments

> Searched on Tuesday, April 14, 2026 using `gemini-3.1-pro-preview`

## Findings

Which GitHub repository to use for searching Reddit posts and comments largely depends on whether you need **live, current data** (subject to Reddit's strict API rules introduced in 2023) or **historical, archived data** (including deleted/removed content).

Following Reddit's decision to restrict its official API and effectively shut down the popular third-party archive *Pushshift* for general use, the open-source community created several new tools and wrappers.

Here is a breakdown of the best current GitHub repositories for searching and analyzing Reddit content, categorized by their underlying data sources.

### 1. Modern Archival APIs & Wrappers (Pushshift Alternatives)

Because the official Reddit API makes it difficult to search deep historical data or retrieve large volumes of comments without hitting rate limits, the community relies on third-party data dumps.

* **[ArthurHeitmann/arctic_shift](https://github.com/ArthurHeitmann/arctic_shift)**
    * **What it is:** One of the most prominent successors to Pushshift. It makes Reddit data accessible through large `.zst` data dumps, a custom API, and a web UI.
    * **Features:** It actively archives Reddit (maintaining datasets into 2025 and 2026) and provides Python helper scripts to query `.jsonl` and `.zst` files [[1](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGNQNzDj3-Q5hAO9c3zusC8BKyOqHk3MiGSZr30Nt1Haj6eIoiAb4MzVe9Fqvx-OyZxEjq2bVklcgZbbj60H2T3LI3XNofL9SWoqQaAvvnOcWNUI64s-2viK07O_tmw76mYP0mAvqTimdA=), [2](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHp2C1w6yv0LMjQsKeOIQ9rRpOz673-iac1g26I3siySHcFR0uAvDek63vXuKKaKNARRzLQoYq2EX95Io1AJvy9ymeEl1TrJjNOg_pXTdVILu4rvol9xxlr1Z1g4xWrg6rF3xqVIK0MW25YwknyWST3fYU=)].
    * **Related UI Repo:** `ArthurHeitmann/arctic_shift_ui` is a Svelte/TypeScript web application built specifically for searching and downloading this archived data via a clean interface [[3](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFrbm3-rRqe7And1WfM7SyBvYiNsiRd6S6ohIH9BxiN3XjEEDKLcwqRU92NahZSataFIHNcczEXR6Nq83TNbFgEf7HJKwxlH48PAkl2zsveoA3qNlTl9TsQo8JaUER2G0AsP6X5nPTB1f7Pa7U=)].
* **[pullpush-io/pullpush-io.github.io](https://github.com/pullpush-io/pullpush-io.github.io)**
    * **What it is:** The official hub for **PullPush**, a service explicitly built to keep third-party Reddit apps running after Pushshift was restricted. It relies on torrent networks and continuous community scraping [[4](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGU_FF6TE9vYnAE9ikFM3kul6y2Np1tPagZjh_TV72_FD8CCUTbhaQ4vtDe_26SU4JyjXBKWUVDPKwYrEhxJB3CU4EPfA0DBOi8Ey29K6jjHk2o8ztvx-Pq)].
    * **How it works:** It provides an API identical to the old Pushshift API, allowing you to query historical comments, user histories, and subreddit data [[4](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGU_FF6TE9vYnAE9ikFM3kul6y2Np1tPagZjh_TV72_FD8CCUTbhaQ4vtDe_26SU4JyjXBKWUVDPKwYrEhxJB3CU4EPfA0DBOi8Ey29K6jjHk2o8ztvx-Pq), [5](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQG8KGCUJkgsAUG075ZUrtBufFhxCC88cU3jJdfxcpcfxIOVOJ1FzQTGowGV_IXqq5CB6YbsB-y-LJvGkb9Brso3x_qY1_bjR2jrnLDgyBum-V-S27a8CXV-jKzmcEjkiEeOmgjtu7NbDRs=)].
* **[maxjo020418/BAScraper](https://github.com/maxjo020418/BAScraper)**
    * **What it is:** An asynchronous Python wrapper designed specifically to fetch Reddit posts and comments using both **PullPush** and **Arctic-Shift** [[6](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGHgDpYdgEMfEMPTuoUCVq6jVdWWg87YSkX0na8Rg1-E6qyEFuQ06nfERo672GpnZWYCl7CrvokF-NhOg8ArfusSwM7Zc096JYFfz-Lu_IhTpVXKNPoctxZmUcA556melUeeiw=)].
    * **Current Status:** Highly relevant in the post-2023 ecosystem. It bypasses the official Reddit API limitations, acting as a modern replacement for deprecated tools like PMAW (Pushshift Multithreaded API Wrapper) [[6](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGHgDpYdgEMfEMPTuoUCVq6jVdWWg87YSkX0na8Rg1-E6qyEFuQ06nfERo672GpnZWYCl7CrvokF-NhOg8ArfusSwM7Zc096JYFfz-Lu_IhTpVXKNPoctxZmUcA556melUeeiw=)].

### 2. Search Engines & Web Interfaces

If you don't want to code your own scraper and just want to deploy an existing search engine UI locally or via GitHub Pages, these repositories are ideal:

* **[SachinRammoorthy/reddit-search](https://github.com/SachinRammoorthy/reddit-search)**
    * **What it is:** A custom, open-source search engine designed to navigate the "infamously hard-to-search Reddit" [[7](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHpVQXU-qAXq2lWJKJxQRQhavbVR6mXd4-dUtfoDc5wUh23inf_8RrZ7jQk2e60ESYya-DyWMLw25JNLpzdrwVHDc6zUmJX_hEfrfbwtyb9EBoFCQOo_hPyeLcRvqVQqQSztaaUi4lnJfufYA==)].
    * **Features:** Offers advanced qualifier filtering, saved searches, and local hosting (typically running on `localhost:8000`) [[7](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHpVQXU-qAXq2lWJKJxQRQhavbVR6mXd4-dUtfoDc5wUh23inf_8RrZ7jQk2e60ESYya-DyWMLw25JNLpzdrwVHDc6zUmJX_hEfrfbwtyb9EBoFCQOo_hPyeLcRvqVQqQSztaaUi4lnJfufYA==)].
* **[viralhysteria/reddit-archive](https://github.com/viralhysteria/reddit-archive)**
    * **What it is:** A Bootstrap-driven frontend web app for exploring Reddit data [[5](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQG8KGCUJkgsAUG075ZUrtBufFhxCC88cU3jJdfxcpcfxIOVOJ1FzQTGowGV_IXqq5CB6YbsB-y-LJvGkb9Brso3x_qY1_bjR2jrnLDgyBum-V-S27a8CXV-jKzmcEjkiEeOmgjtu7NbDRs=)].
    * **How it works:** It provides a polished search interface that plugs directly into the PullPush API. It is specifically useful for uncovering older content, or content that was removed/quarantined by Reddit moderators [[5](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQG8KGCUJkgsAUG075ZUrtBufFhxCC88cU3jJdfxcpcfxIOVOJ1FzQTGowGV_IXqq5CB6YbsB-y-LJvGkb9Brso3x_qY1_bjR2jrnLDgyBum-V-S27a8CXV-jKzmcEjkiEeOmgjtu7NbDRs=)].
* **[reveddit/reveddit](https://github.com/reveddit/reveddit)**
    * **What it is:** The open-source code behind the famous *Reveddit.com* [[8](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEubrZ0FqL-6-DCT191j3ZV0e5eSZZGCvCwADlku5Yhbrje8ZCFjfxz5oZ-Z0mltV-tZhd0zCtvRMgstScFIQSJvvgxULTTmDPTYyG3eDhit9PftwltLwzGg2IZpaKv)].
    * **Features:** While its reliance on external archives has required continuous patching over the years, the React-based frontend is built to compare what is currently visible on Reddit to what was originally archived, allowing you to easily spot shadow-banned comments or automated moderator removals [[8](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEubrZ0FqL-6-DCT191j3ZV0e5eSZZGCvCwADlku5Yhbrje8ZCFjfxz5oZ-Z0mltV-tZhd0zCtvRMgstScFIQSJvvgxULTTmDPTYyG3eDhit9PftwltLwzGg2IZpaKv)].

### 3. Official API Tools & AI Analysis

If you have an official Reddit Developer account and an OAuth token, you can use these repositories to search live Reddit data.

* **[praw-dev/praw](https://github.com/praw-dev/praw)**
    * **What it is:** The **Python Reddit API Wrapper**. This is the de facto standard Python wrapper for Reddit's official API; it is community-maintained rather than published by Reddit [[9](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQErvu-gbbRQ7ohwH0AavnuvFz1FQHlwnAWhQpKqq1GsCrCxmED5nAZWIz1ClMUhIKGRPpM3UbFcx2dJ2BAX6zICmV4lANpeSTKP37_k0SIaCcLLbLrnS5x-Ismn), [10](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFyhKPelg4sH8XnBiRsVXkxnShAFDw9DCdBOw8uUBTfr1hlmglbIllN1YQ_xPAiIvygBln5xZfccuei_tE-FSj0vcybPnN4mgxh10PnDCWAOTBG5cd0TRzPmHo13XszBHi1)].
    * **Features:** Handles OAuth natively, respects Reddit's strict rate limits automatically (no need for manual `sleep()` calls), and allows you to build custom comment tree traversals and keyword searches over active subreddits [[9](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQErvu-gbbRQ7ohwH0AavnuvFz1FQHlwnAWhQpKqq1GsCrCxmED5nAZWIz1ClMUhIKGRPpM3UbFcx2dJ2BAX6zICmV4lANpeSTKP37_k0SIaCcLLbLrnS5x-Ismn), [10](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFyhKPelg4sH8XnBiRsVXkxnShAFDw9DCdBOw8uUBTfr1hlmglbIllN1YQ_xPAiIvygBln5xZfccuei_tE-FSj0vcybPnN4mgxh10PnDCWAOTBG5cd0TRzPmHo13XszBHi1)].
* **[shayonpal/check-reddit-posts](https://github.com/shayonpal/check-reddit-posts)**
    * **What it is:** A utility script known as "Reddit Post Analyzer." It uses Reddit's official API to fetch posts and their top comments [[11](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEjwFItG_egua7MHBnpZwRlJYHO_6Q2dsxQRy48R7tmaxfGMttb-7gnh5fz8oYGcHC0HnCJaVsRuWfkM9WSAfsIo53t2pgz3n8Gd-7NFfFtOSNyP1Uzjm7NC1Ii0B0qFx1zUK31WczCWAE=)].
    * **Features:** It pipes the fetched Reddit data directly into OpenAI's GPT-4 to perform sentiment analysis, uncover pain points, and extract business opportunities from subreddits, outputting the results in JSON and Markdown formats [[11](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEjwFItG_egua7MHBnpZwRlJYHO_6Q2dsxQRy48R7tmaxfGMttb-7gnh5fz8oYGcHC0HnCJaVsRuWfkM9WSAfsIo53t2pgz3n8Gd-7NFfFtOSNyP1Uzjm7NC1Ii0B0qFx1zUK31WczCWAE=)].

### Summary Recommendation

* If you need **live data or want to build a bot**, use **`praw-dev/praw`** [[9](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQErvu-gbbRQ7ohwH0AavnuvFz1FQHlwnAWhQpKqq1GsCrCxmED5nAZWIz1ClMUhIKGRPpM3UbFcx2dJ2BAX6zICmV4lANpeSTKP37_k0SIaCcLLbLrnS5x-Ismn)].
* If you are doing **data science, market research, or looking for deleted content**, use **`maxjo020418/BAScraper`** [[6](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGHgDpYdgEMfEMPTuoUCVq6jVdWWg87YSkX0na8Rg1-E6qyEFuQ06nfERo672GpnZWYCl7CrvokF-NhOg8ArfusSwM7Zc096JYFfz-Lu_IhTpVXKNPoctxZmUcA556melUeeiw=)] to interface with **`ArthurHeitmann/arctic_shift`** [[1](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGNQNzDj3-Q5hAO9c3zusC8BKyOqHk3MiGSZr30Nt1Haj6eIoiAb4MzVe9Fqvx-OyZxEjq2bVklcgZbbj60H2T3LI3XNofL9SWoqQaAvvnOcWNUI64s-2viK07O_tmw76mYP0mAvqTimdA=)] or **`pullpush-io`** [[4](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGU_FF6TE9vYnAE9ikFM3kul6y2Np1tPagZjh_TV72_FD8CCUTbhaQ4vtDe_26SU4JyjXBKWUVDPKwYrEhxJB3CU4EPfA0DBOi8Ey29K6jjHk2o8ztvx-Pq)].
* If you just want to **host a search interface**, clone **`viralhysteria/reddit-archive`** [[5](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQG8KGCUJkgsAUG075ZUrtBufFhxCC88cU3jJdfxcpcfxIOVOJ1FzQTGowGV_IXqq5CB6YbsB-y-LJvGkb9Brso3x_qY1_bjR2jrnLDgyBum-V-S27a8CXV-jKzmcEjkiEeOmgjtu7NbDRs=)] or **`SachinRammoorthy/reddit-search`** [[7](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHpVQXU-qAXq2lWJKJxQRQhavbVR6mXd4-dUtfoDc5wUh23inf_8RrZ7jQk2e60ESYya-DyWMLw25JNLpzdrwVHDc6zUmJX_hEfrfbwtyb9EBoFCQOo_hPyeLcRvqVQqQSztaaUi4lnJfufYA==)].

## Sources

1. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGNQNzDj3-Q5hAO9c3zusC8BKyOqHk3MiGSZr30Nt1Haj6eIoiAb4MzVe9Fqvx-OyZxEjq2bVklcgZbbj60H2T3LI3XNofL9SWoqQaAvvnOcWNUI64s-2viK07O_tmw76mYP0mAvqTimdA=)
2. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHp2C1w6yv0LMjQsKeOIQ9rRpOz673-iac1g26I3siySHcFR0uAvDek63vXuKKaKNARRzLQoYq2EX95Io1AJvy9ymeEl1TrJjNOg_pXTdVILu4rvol9xxlr1Z1g4xWrg6rF3xqVIK0MW25YwknyWST3fYU=)
3. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFrbm3-rRqe7And1WfM7SyBvYiNsiRd6S6ohIH9BxiN3XjEEDKLcwqRU92NahZSataFIHNcczEXR6Nq83TNbFgEf7HJKwxlH48PAkl2zsveoA3qNlTl9TsQo8JaUER2G0AsP6X5nPTB1f7Pa7U=)
4. [github.io](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGU_FF6TE9vYnAE9ikFM3kul6y2Np1tPagZjh_TV72_FD8CCUTbhaQ4vtDe_26SU4JyjXBKWUVDPKwYrEhxJB3CU4EPfA0DBOi8Ey29K6jjHk2o8ztvx-Pq)
5. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQG8KGCUJkgsAUG075ZUrtBufFhxCC88cU3jJdfxcpcfxIOVOJ1FzQTGowGV_IXqq5CB6YbsB-y-LJvGkb9Brso3x_qY1_bjR2jrnLDgyBum-V-S27a8CXV-jKzmcEjkiEeOmgjtu7NbDRs=)
6. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGHgDpYdgEMfEMPTuoUCVq6jVdWWg87YSkX0na8Rg1-E6qyEFuQ06nfERo672GpnZWYCl7CrvokF-NhOg8ArfusSwM7Zc096JYFfz-Lu_IhTpVXKNPoctxZmUcA556melUeeiw=)
7. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHpVQXU-qAXq2lWJKJxQRQhavbVR6mXd4-dUtfoDc5wUh23inf_8RrZ7jQk2e60ESYya-DyWMLw25JNLpzdrwVHDc6zUmJX_hEfrfbwtyb9EBoFCQOo_hPyeLcRvqVQqQSztaaUi4lnJfufYA==)
8. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEubrZ0FqL-6-DCT191j3ZV0e5eSZZGCvCwADlku5Yhbrje8ZCFjfxz5oZ-Z0mltV-tZhd0zCtvRMgstScFIQSJvvgxULTTmDPTYyG3eDhit9PftwltLwzGg2IZpaKv)
9. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQErvu-gbbRQ7ohwH0AavnuvFz1FQHlwnAWhQpKqq1GsCrCxmED5nAZWIz1ClMUhIKGRPpM3UbFcx2dJ2BAX6zICmV4lANpeSTKP37_k0SIaCcLLbLrnS5x-Ismn)
10. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFyhKPelg4sH8XnBiRsVXkxnShAFDw9DCdBOw8uUBTfr1hlmglbIllN1YQ_xPAiIvygBln5xZfccuei_tE-FSj0vcybPnN4mgxh10PnDCWAOTBG5cd0TRzPmHo13XszBHi1)
11. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEjwFItG_egua7MHBnpZwRlJYHO_6Q2dsxQRy48R7tmaxfGMttb-7gnh5fz8oYGcHC0HnCJaVsRuWfkM9WSAfsIo53t2pgz3n8Gd-7NFfFtOSNyP1Uzjm7NC1Ii0B0qFx1zUK31WczCWAE=)

---

*Search queries: "reddit search engine github", "pushshift alternative github", "github repo search reddit posts OR comments", ""github" search reddit posts comments", "github praw", "github repo arctic shift", "github repo pmaw", "github repo arctic shift reddit"*
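
### Appendix: Example sketches

The PullPush entry describes an API modeled on the old Pushshift API. As a rough illustration, the sketch below only builds a search URL; it sends no request. The base URL `https://api.pullpush.io/reddit/search` and the `q`, `subreddit`, and `size` parameters are assumptions carried over from Pushshift conventions, and `build_pullpush_url` is a hypothetical helper, so check them against the PullPush documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed base URL for the public PullPush API (not confirmed by the report).
PULLPUSH_BASE = "https://api.pullpush.io/reddit/search"


def build_pullpush_url(kind, query, subreddit=None, size=25):
    """Build a Pushshift-style search URL for comments or submissions.

    `kind` is "comment" or "submission"; parameter names are assumptions
    borrowed from the old Pushshift API.
    """
    if kind not in ("comment", "submission"):
        raise ValueError("kind must be 'comment' or 'submission'")
    params = {"q": query, "size": size}
    if subreddit:
        params["subreddit"] = subreddit
    return f"{PULLPUSH_BASE}/{kind}/?{urlencode(params)}"


if __name__ == "__main__":
    print(build_pullpush_url("comment", "rate limit", subreddit="redditdev", size=5))
```

The resulting URL can be fetched with any HTTP client. Pushshift-style responses conventionally wrap results in a top-level `data` list, but that shape is also an assumption here.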
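
The arctic_shift entry mentions querying `.jsonl` and `.zst` dump files. The following is a minimal sketch of a keyword scan over newline-delimited JSON records, not the repository's own helper scripts. `search_dump_lines` is a hypothetical function, and the `body`, `title`, and `selftext` field names are assumed from the usual Reddit dump layout.

```python
import json


def search_dump_lines(lines, keyword, fields=("body", "title", "selftext")):
    """Yield JSON records whose text fields contain `keyword` (case-insensitive).

    `lines` is any iterable of strings, e.g. an open .jsonl file.
    """
    needle = keyword.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting a long scan
        if not isinstance(record, dict):
            continue
        # Missing or null fields are treated as empty strings.
        text = " ".join(str(record.get(field) or "") for field in fields)
        if needle in text.lower():
            yield record


if __name__ == "__main__":
    sample = [
        '{"id": "a1", "body": "PRAW handles rate limits"}',
        '{"id": "b2", "title": "Hello", "selftext": "nothing here"}',
    ]
    for hit in search_dump_lines(sample, "praw"):
        print(hit["id"])
```

Because the function accepts any iterable of lines, a `.zst` dump could be streamed through a decompressor (for example the third-party `zstandard` package wrapped in `io.TextIOWrapper`) and passed in without loading the whole file into memory.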
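
For the live-data route, the PRAW entry mentions keyword searches over active subreddits. The sketch below assumes an already-authenticated `praw.Reddit` instance is passed in as `reddit`; `search_subreddit` is a hypothetical helper, and the credentials in the trailing comment are placeholders.

```python
def search_subreddit(reddit, subreddit_name, query, limit=10):
    """Return id, title, and permalink for submissions matching `query`.

    `reddit` is expected to behave like a `praw.Reddit` instance.
    """
    results = []
    listing = reddit.subreddit(subreddit_name).search(query, sort="new", limit=limit)
    for submission in listing:
        results.append(
            {
                "id": submission.id,
                "title": submission.title,
                "permalink": submission.permalink,
            }
        )
    return results


# Example wiring (requires `pip install praw` and your own app credentials):
#
#   import praw
#   reddit = praw.Reddit(
#       client_id="YOUR_CLIENT_ID",
#       client_secret="YOUR_CLIENT_SECRET",
#       user_agent="search-sketch by u/your_username",
#   )
#   for row in search_subreddit(reddit, "redditdev", "rate limit", limit=5):
#       print(row["title"])
```

Passing the client in, rather than constructing it inside the function, keeps credentials out of the search logic and lets the function be exercised with a stub object.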