Initial commit - WhyRating Engine (Google Reviews Scraper)
This commit is contained in:
142
CLAUDE.md
Normal file
142
CLAUDE.md
Normal file
@@ -0,0 +1,142 @@
|
||||
# Google Reviews Scraper Pro - Claude Code Instructions
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Run with NUC Database (Recommended)
|
||||
The PostgreSQL database is hosted on the NUC server. Only the API runs locally.
|
||||
|
||||
```bash
|
||||
# Use NUC database config
|
||||
cp .env.nuc .env
|
||||
|
||||
# Start API only (connects to NUC database)
|
||||
docker compose -f docker-compose.production.yml -f docker-compose.nuc.yml up -d
|
||||
|
||||
# View logs
|
||||
docker compose -f docker-compose.production.yml logs -f api
|
||||
```
|
||||
|
||||
### Run Fully Local (Legacy)
|
||||
Runs both PostgreSQL and API locally.
|
||||
|
||||
```bash
|
||||
# Use local database config
|
||||
cp .env.example .env
|
||||
# Edit .env with your settings
|
||||
|
||||
# Start all services
|
||||
docker compose -f docker-compose.production.yml up -d
|
||||
```
|
||||
|
||||
## NUC Database Connection
|
||||
|
||||
| Property | Value |
|
||||
|----------|-------|
|
||||
| Host | 192.168.1.3 |
|
||||
| Port | 5437 |
|
||||
| Database | scraper |
|
||||
| User | scraper |
|
||||
| Password | scraper_nuc_2026 |
|
||||
| Coolify UUID | g4s8w4csk8s8ocswg48kkogo |
|
||||
|
||||
```bash
|
||||
# Direct connection
|
||||
psql postgresql://scraper:scraper_nuc_2026@192.168.1.3:5437/scraper
|
||||
|
||||
# Via SSH tunnel (if needed)
|
||||
ssh -L 5437:localhost:5437 nuc
|
||||
```
|
||||
|
||||
## Service URLs
|
||||
|
||||
| Service | URL |
|
||||
|---------|-----|
|
||||
| API | http://localhost:8001 |
|
||||
| API Docs | http://localhost:8001/docs |
|
||||
| VNC (browser debugging) | http://localhost:6080 |
|
||||
| VNC (client) | vnc://localhost:5900 |
|
||||
|
||||
## Common Commands
|
||||
|
||||
```bash
|
||||
# Start services
|
||||
docker compose -f docker-compose.production.yml -f docker-compose.nuc.yml up -d
|
||||
|
||||
# Stop services
|
||||
docker compose -f docker-compose.production.yml -f docker-compose.nuc.yml down
|
||||
|
||||
# View API logs
|
||||
docker logs -f scraper-api
|
||||
|
||||
# Rebuild API after code changes
|
||||
docker compose -f docker-compose.production.yml -f docker-compose.nuc.yml up -d --build api
|
||||
|
||||
# Run a scrape job (example)
|
||||
curl -X POST http://localhost:8001/api/jobs \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"url": "https://www.google.com/maps/place/..."}'
|
||||
|
||||
# Check job status
|
||||
curl http://localhost:8001/api/jobs/{job_id}
|
||||
```
|
||||
|
||||
## Database Management
|
||||
|
||||
```bash
|
||||
# Connect to NUC database
|
||||
docker run --rm -it postgres:15-alpine psql postgresql://scraper:scraper_nuc_2026@192.168.1.3:5437/scraper
|
||||
|
||||
# Backup database
|
||||
ssh nuc "docker exec postgres-g4s8w4csk8s8ocswg48kkogo pg_dump -U scraper scraper" > backup.sql
|
||||
|
||||
# Restore database
|
||||
cat backup.sql | ssh nuc "docker exec -i postgres-g4s8w4csk8s8ocswg48kkogo psql -U scraper scraper"
|
||||
```
|
||||
|
||||
## Project Structure
|
||||
|
||||
```
|
||||
├── api/ # FastAPI backend
|
||||
├── packages/
|
||||
│ ├── pipeline-core/ # Shared pipeline utilities
|
||||
│ └── reviewiq-pipeline/ # Review analysis pipeline
|
||||
├── web/ # Next.js frontend (optional)
|
||||
├── db/init/ # Database initialization scripts
|
||||
├── docker-compose.production.yml # Main compose file
|
||||
├── docker-compose.nuc.yml # NUC database override
|
||||
├── .env.nuc # NUC environment config
|
||||
└── Dockerfile # API container build
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### API can't connect to NUC database
|
||||
```bash
|
||||
# Check NUC is reachable
|
||||
nc -zv 192.168.1.3 5437
|
||||
|
||||
# Check database is running
|
||||
ssh nuc "docker ps | grep postgres-g4s8w4csk8s8ocswg48kkogo"
|
||||
|
||||
# Restart database on NUC
|
||||
ssh nuc "docker restart postgres-g4s8w4csk8s8ocswg48kkogo"
|
||||
```
|
||||
|
||||
### Chrome/Scraping issues
|
||||
```bash
|
||||
# Check VNC for visual debugging
|
||||
open http://localhost:6080
|
||||
|
||||
# Increase shared memory if crashes
|
||||
# Edit docker-compose: shm_size: 4gb
|
||||
```
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Variable | Description | Default |
|
||||
|----------|-------------|---------|
|
||||
| DATABASE_URL | PostgreSQL connection string | (required) |
|
||||
| API_BASE_URL | Public API URL | http://localhost:8001 |
|
||||
| MAX_CONCURRENT_JOBS | Parallel scrape jobs | 5 |
|
||||
| OPENAI_API_KEY | For ReviewIQ analysis | (optional) |
|
||||
| ANTHROPIC_API_KEY | For ReviewIQ analysis | (optional) |
|
||||
Reference in New Issue
Block a user