2023-08-09 project diary entry
Project futures; planning notes:
- focus on extracting topics from documents and collections of documents
- use more than one framework (e.g., langchain, llama-index, ?)
- use OpenAI and other models, including huggingface pipelines, gpt4all, ?
- explore value of organizing aipraxisLab repository along model or framework axes (?)
- set up three different document collections for testing
- keep notes on Python setup details (venv, pip, what else?)
- wrap up individual experiments with notes and possibly put into separate folders
- keep the lab benches clean, with only a few exp'ts going at a time (perfect use for a Kanban board); in fact, limit number of open exp'ts to three or fewer.
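
For the "keep notes on Python setup details" item above, one possible starting point is a small script that captures the interpreter and installed packages so they can be pasted into an experiment's notes. This is only a sketch using the standard library; the function name `environment_notes` and the choice of fields are assumptions, not an established convention for this repository.

```python
import platform
import sys
from importlib import metadata


def environment_notes():
    """Collect basic Python setup details for an experiment's notes.

    Hypothetical helper: the fields recorded here are a guess at what
    is worth keeping, not a fixed list.
    """
    # Inside a venv, sys.prefix points at the venv while
    # sys.base_prefix points at the base interpreter.
    in_venv = sys.prefix != sys.base_prefix

    # Roughly the information `pip freeze` would report.
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
    )

    return {
        "python_version": platform.python_version(),
        "executable": sys.executable,
        "in_venv": in_venv,
        "packages": packages,
    }


notes = environment_notes()
print(f"Python {notes['python_version']} ({notes['executable']})")
print(f"virtual environment active: {notes['in_venv']}")
print(f"{len(notes['packages'])} packages installed")
```

Saving this output next to each wrapped-up experiment would make it easier to recreate the setup later.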
Peter Kaminsky suggestion from Massive Wiki Wednesday 2023-08-09 call:
Idea for Category / Topic Mapping for Articles
- have ChatGPT make a list of categories or topics
- then have it make sub-categories and sub-sub-categories, etc. as desired
- have ChatGPT synthesize articles for the leaf categories
- generate embeddings for the synthetic articles
- do vector database matching with embeddings for real articles
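
A rough sketch of the matching steps in this idea, to make the data flow concrete. Everything here is a stand-in: the category names and "synthetic articles" are hard-coded placeholders (in the idea they would come from ChatGPT), `embed` is a toy word-count vector rather than a real embedding model, and the "vector database" is a plain dict searched by brute force. All function and variable names are hypothetical.

```python
import math
import re
from collections import Counter

# Placeholder leaf categories with placeholder synthetic articles.
# Assumption: in the real workflow ChatGPT would generate both.
SYNTHETIC_ARTICLES = {
    "Cooking/Baking/Bread": "bread dough yeast flour oven baking loaf knead rise",
    "Cooking/Grilling/Vegetables": "grill charcoal vegetables skewer smoke heat marinade",
    "Technology/Software/Python": "python code function package module script interpreter",
}


def embed(text):
    """Toy embedding: word counts. A stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Stand-in for the vector database: category path -> embedding.
INDEX = {cat: embed(text) for cat, text in SYNTHETIC_ARTICLES.items()}


def categorize(article, top_k=1):
    """Return the top_k category paths whose synthetic article is closest."""
    query = embed(article)
    ranked = sorted(INDEX, key=lambda cat: cosine(query, INDEX[cat]), reverse=True)
    return ranked[:top_k]


real_article = "I knead the dough, let it rise, and bake the loaf in a hot oven."
print(categorize(real_article))
```

Swapping `embed` for calls to an actual embedding model and `INDEX` for a real vector store should leave the shape of `categorize` unchanged, which would make this a reasonable first experiment to compare across frameworks.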