This page describes the required functionality for the "AI Agents" plugin. Please don't hesitate to contact lucian@amplenote or lu.cian on Discord for clarifications.
An AI Agent is a tool that combines (1) the text-generation capabilities of an LLM such as ChatGPT with (2) the privileges to act on requests, for example by invoking tools or calling APIs.
The AI Agents we want inside Amplenote should be able to converse, search the web, rewrite paragraphs, etc. while also executing complex requests from the user, such as creating and populating notes, creating and editing tasks, reading and summarizing notes, checking the user's Task Domains for schedules, etc.
Example: A conversation with an agent where the user submits the following prompts:
Look up the 10 tallest buildings on Earth and find the names of their respective lead architects
Look up the nationality, age and website for each of those persons
Add those 10 architects to my CRM
Add 30-minute blocks to my schedule for each person, reminding me to cold email them next week
Together, these prompts should cause the agent to query an LLM or search the web for the requested facts, while also:
Creating one note for each architect
Filling each note with the personal data collected (age, nationality, website)
Making sure the notes are tagged as "CRM" or something similar
Creating 10 new 30-minute tasks called "Cold email <NAME>" in empty calendar slots, each linking to the respective architect's note
This is just an example, but it illustrates how we can build intelligence into an agent by teaching it that generally "a person = a note", "adding something to a schedule = first checking task domains for slots where nothing is scheduled", etc.
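The rule "adding something to a schedule = first checking task domains for slots where nothing is scheduled" could be sketched roughly as follows. This is a minimal sketch, not the plugin's implementation: the `Interval` shape, the `findFreeSlots` name, and the use of plain minute offsets are assumptions made for illustration, and real Task Domain data would first need to be mapped into this shape.

```typescript
// Hypothetical shape: a time span in minutes from an arbitrary origin.
interface Interval {
  start: number;
  end: number;
}

// Return up to `count` free slots of `duration` minutes inside `range`,
// never overlapping any interval in `busy`.
function findFreeSlots(
  busy: Interval[],
  range: Interval,
  duration: number,
  count: number
): Interval[] {
  const sorted = busy.slice().sort((a, b) => a.start - b.start);
  const slots: Interval[] = [];
  let cursor = range.start;

  for (const b of sorted) {
    // Fill the gap between the cursor and the next busy interval.
    const gapEnd = Math.min(b.start, range.end);
    while (slots.length < count && cursor + duration <= gapEnd) {
      slots.push({ start: cursor, end: cursor + duration });
      cursor += duration;
    }
    // Jump past the busy interval (max handles overlapping intervals).
    cursor = Math.max(cursor, b.end);
  }

  // Fill whatever remains after the last busy interval.
  while (slots.length < count && cursor + duration <= range.end) {
    slots.push({ start: cursor, end: cursor + duration });
    cursor += duration;
  }
  return slots;
}
```

For instance, with busy intervals 60-90 and 120-180 inside a 0-240 range, asking for four 30-minute slots yields 0-30, 30-60, 90-120 and 180-210.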
Detailed requirements
Invocation & Context
This plugin can be invoked from inside a note with insertText or replaceText
Or app-wide from appOption
The options available for each invocation method will probably differ
The plugin will consider as "Context" the following:
The current line/task, when insertText is used on a non-empty line/task
The entire section/heading, when insertText is used on an empty line or inside a table
The selection, when replaceText is used
The entire open note, when appOption is used
The plugin will also offer an appOption that allows you to simply "Chat" (i.e. no context is used)
No matter the invocation method, the AI chat interface will be opened in the Peek Viewer
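The context rules above can be expressed as a single lookup. The sketch below is illustrative only: `EditorState` is a hypothetical snapshot of what the editor exposes at invocation time, and `"chat"` is an assumed label for the context-free appOption. None of these names come from the Amplenote plugin API.

```typescript
// Assumed labels for the invocation methods listed above.
type Invocation = "insertText" | "replaceText" | "appOption" | "chat";

// Hypothetical snapshot of the editor at the moment the plugin is invoked.
interface EditorState {
  noteText: string;    // entire open note
  sectionText: string; // section/heading around the cursor
  lineText: string;    // current line or task
  selection: string;   // selected text, if any
  inTable: boolean;    // whether the cursor sits inside a table
}

// Pick the text the plugin treats as "Context" for a given invocation.
function resolveContext(invocation: Invocation, state: EditorState): string {
  switch (invocation) {
    case "insertText":
      // Empty line or table: fall back to the whole section.
      return state.lineText.trim() === "" || state.inTable
        ? state.sectionText
        : state.lineText;
    case "replaceText":
      return state.selection;
    case "appOption":
      return state.noteText;
    case "chat":
      return ""; // plain chat uses no context
  }
}
```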
LLM capabilities
Should support calling gpt-4 directly and displaying, in the Peek Viewer embed, any type of output that ChatGPT can display
Should support attaching local files to a chat prompt (if possible via API at the moment)
Should support searching the web when prompted
Should support reading web page content given a valid URL
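As a rough illustration of "calling gpt-4 directly", the sketch below only assembles a request body in the model-plus-messages shape used by OpenAI's Chat Completions API. It makes no network call. The `buildChatRequest` helper and the decision to pass note context as a system message are assumptions, not settled design.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
}

// Hypothetical helper: combine prior chat history, optional note context,
// and the new user prompt into one request body.
function buildChatRequest(
  history: ChatMessage[],
  prompt: string,
  context: string
): ChatRequest {
  const messages: ChatMessage[] = [];
  if (context !== "") {
    // Assumption: context is supplied as a system message.
    messages.push({
      role: "system",
      content: "Context from the user's note:\n" + context,
    });
  }
  messages.push(...history, { role: "user", content: prompt });
  return { model: "gpt-4", messages };
}
```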
Amplenote capabilities
Behind the scenes, the plugin will have a comprehensive and modular interface for interacting with note contents, such that the AI Agent can call combinations of APIs in sequence to execute on a user's request. Examples:
Create a new note and give it a list of tags
Create a new task inside a specific note
Check a particular Task Domain and fetch a list of scheduled tasks
Find a task that was just created by the agent itself and modify its properties or delete it
Fetch notes by name or tag and read and process the contents
Create links to other notes
Apply markdown formatting
Etc.
When the user asks the agent to execute something, the agent should be able to chain a series of API calls to produce the desired output (example)
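One possible shape for the modular interface is a small set of named tools that the agent strings together as a plan. The sketch below runs against an in-memory `Workspace` so that it is self-contained. The `Step` union, `runPlan`, and the tool names are hypothetical and do not correspond to Amplenote's actual plugin API.

```typescript
interface Note { name: string; tags: string[]; body: string; }
interface Task { note: string; content: string; durationMinutes: number; }

// In-memory stand-in for the user's notes and tasks (assumption for the sketch).
interface Workspace { notes: Map<string, Note>; tasks: Task[]; }

// Each step names one tool plus its arguments.
type Step =
  | { tool: "createNote"; name: string; tags: string[] }
  | { tool: "appendToNote"; name: string; text: string }
  | { tool: "createTask"; note: string; content: string; durationMinutes: number };

function requireNote(ws: Workspace, name: string): Note {
  const note = ws.notes.get(name);
  if (note === undefined) throw new Error("Unknown note: " + name);
  return note;
}

// Execute the steps in order and return a human-readable log of what happened.
function runPlan(ws: Workspace, plan: Step[]): string[] {
  const log: string[] = [];
  for (const step of plan) {
    switch (step.tool) {
      case "createNote":
        ws.notes.set(step.name, { name: step.name, tags: step.tags.slice(), body: "" });
        log.push('Created note "' + step.name + '"');
        break;
      case "appendToNote":
        requireNote(ws, step.name).body += step.text + "\n";
        log.push('Appended to "' + step.name + '"');
        break;
      case "createTask":
        requireNote(ws, step.note); // a task must belong to an existing note
        ws.tasks.push({
          note: step.note,
          content: step.content,
          durationMinutes: step.durationMinutes,
        });
        log.push('Created task "' + step.content + '"');
        break;
    }
  }
  return log;
}
```

Under these assumptions, the "Add those 10 architects to my CRM" prompt would expand into one createNote, one appendToNote and one createTask step per person.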
Safety guards
When executing requests that require writing to notes or tasks, every individual write action (e.g. creating a note, editing a task) will first output its contents inside the chat interface and then ask the user for manual confirmation before the operation is applied.
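The confirm-before-write rule could be enforced by staging every write and applying it only once the user approves. The following is a minimal sketch under that assumption. `WriteAction` and `ConfirmationQueue` are hypothetical names, and how the confirmation button is wired into the chat interface is left open.

```typescript
// A write the agent wants to perform, with the text to show the user first.
interface WriteAction {
  description: string; // e.g. "Create note"
  preview: string;     // the contents that would be written
  apply: () => void;   // performs the actual write
}

class ConfirmationQueue {
  private pending = new Map<number, WriteAction>();
  private nextId = 1;

  // Stage an action and return the message the chat should display.
  // Nothing is written at this point.
  stage(action: WriteAction): { id: number; message: string } {
    const id = this.nextId++;
    this.pending.set(id, action);
    return {
      id,
      message: action.description + "\n" + action.preview + "\nConfirm?",
    };
  }

  // Called when the user confirms; applies the action at most once.
  confirm(id: number): boolean {
    const action = this.pending.get(id);
    if (action === undefined) return false;
    this.pending.delete(id);
    action.apply();
    return true;
  }

  // Called when the user declines; the action is dropped without being applied.
  reject(id: number): boolean {
    return this.pending.delete(id);
  }
}
```

Because each action carries its own id, a multi-step request produces one confirmation per write rather than a single blanket approval.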
Predefined prompts
The following should be added as one-click prompts available from insertText and replaceText:
Writing:
Write a blog post
Generate an outline for an article
Write a short summary
List key takeaways
Fix grammar and spelling
Improve writing
Continue writing this sentence
Suggest titles
Generate a counter argument
Generate an email
List action items
Custom prompts/agent editor
The plugin allows the user to create custom prompts to pass to the LLM
The plugin allows the user to reference the "context" as described above by enclosing it inside curly brackets: {context}. The same applies to {attachments} passed to the agent.
The prompt can be saved with a name, which will be visible in insertText and replaceText contexts.
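Placeholder substitution for custom prompts could work along these lines. This is a sketch only: `renderPrompt` and `PromptLibrary` are hypothetical names, and representing attachments as a comma-separated list of file names is an assumption. Both placeholders are replaced in a single pass, so a literal "{attachments}" that happens to appear inside the context text is left untouched.

```typescript
interface PromptValues {
  context: string;
  attachments: string[]; // assumption: attachments are referenced by file name
}

// Replace {context} and {attachments} in one pass over the template.
function renderPrompt(template: string, values: PromptValues): string {
  return template.replace(
    /\{(context|attachments)\}/g,
    (_match: string, key: string) =>
      key === "context" ? values.context : values.attachments.join(", ")
  );
}

// Named custom prompts, as they might be listed under insertText/replaceText.
class PromptLibrary {
  private prompts = new Map<string, string>();

  save(name: string, template: string): void {
    this.prompts.set(name, template);
  }

  names(): string[] {
    return Array.from(this.prompts.keys());
  }

  render(name: string, values: PromptValues): string {
    const template = this.prompts.get(name);
    if (template === undefined) throw new Error("Unknown prompt: " + name);
    return renderPrompt(template, values);
  }
}
```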
Peek viewer interface
The main interface for the AI Agent should work like a chat with extra features
Chat:
Should allow sending messages to the LLM and reading responses
Should show a scrollable history of messages between the user and the agent
Should allow uploading files to a message
For every response, should allow copying the output or inserting it in a note
Options for inserting in a note:
Insert in new or existing note (prompt the user for these details)
Insert at the current position (when invoked via insertText)
Replace current selection (when invoked via replaceText)
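Which insert options are offered depends on how the plugin was invoked. A minimal sketch of that rule follows. The function name and the option labels as plain strings are illustrative assumptions.

```typescript
type Invocation = "insertText" | "replaceText" | "appOption";

// List the "insert in a note" options to show under a response.
function insertOptions(invocation: Invocation): string[] {
  // Always available, regardless of invocation method.
  const options = ["Insert in new or existing note"];
  if (invocation === "insertText") {
    options.push("Insert at the current position");
  }
  if (invocation === "replaceText") {
    options.push("Replace current selection");
  }
  return options;
}
```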