AI agents (Amplenote templates plugin bounty requirements)

Published by Lucian October 21, 2024

Last changed about 1 year ago

This page describes the required functionality for the "AI Agents" plugin. Please don't hesitate to contact lucian@amplenote or lu.cian on Discord for clarifications.

An AI Agent is a tool that combines (1) the text-generation capabilities of an LLM such as ChatGPT with (2) the privileges to act on requests, for example by invoking tools or calling APIs.

The AI Agents we want inside Amplenote should be able to converse, search the web, rewrite paragraphs, etc. while also executing complex requests from the user, such as creating and populating notes, creating and editing tasks, reading and summarizing notes, checking the user's Task Domains for schedules, etc.

Example: A conversation with an agent where the user submits the following prompts:

Look up the 10 tallest buildings on Earth and find the names of their respective lead architects
Look up the nationality, age and website for each of those persons
Add those 10 architects to my CRM
Add 30-minute blocks to my schedule for each person, reminding me to cold email them next week

These prompts should result in the agent prompting an LLM or searching the web for the requested facts, while also:

Creating one note for each architect
Filling each note with the personal data collected (age, nationality, website)
Making sure the notes are tagged as "CRM" or something similar
Creating 10 new 30-minute tasks in empty calendar slots called "Cold email <NAME>", that link to each of the respective notes

This is just an example, but it illustrates how we can build intelligence into an agent by teaching it that generally "a person = a note", "adding something to a schedule = first checking task domains for slots where nothing is scheduled", etc.
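The "check task domains for slots where nothing is scheduled" rule could be sketched roughly as follows. This is a minimal illustration, not the Amplenote API: the BusyBlock shape, the findFreeSlots name and the minutes-based time window are all assumptions made for the example.

```typescript
// Hypothetical shape for illustration; not part of the Amplenote plugin API.
interface BusyBlock {
  startMinute: number; // minutes since the start of the planning window
  endMinute: number;   // exclusive end
}

// Returns the start minutes of up to `count` non-overlapping free slots of
// `durationMinutes` each, scanning the window from start to end.
function findFreeSlots(
  busy: BusyBlock[],
  windowStart: number,
  windowEnd: number,
  durationMinutes: number,
  count: number
): number[] {
  // Sort a copy so the caller's array is left untouched.
  const sorted = busy.slice().sort((a, b) => a.startMinute - b.startMinute);
  const slots: number[] = [];
  let cursor = windowStart;

  for (const block of sorted) {
    // Fill the gap before this busy block with as many slots as fit.
    const gapEnd = Math.min(block.startMinute, windowEnd);
    while (slots.length < count && cursor + durationMinutes <= gapEnd) {
      slots.push(cursor);
      cursor += durationMinutes;
    }
    // Skip past the busy block; blocks may overlap, so never move backwards.
    cursor = Math.max(cursor, block.endMinute);
  }

  // Fill whatever remains after the last busy block.
  while (slots.length < count && cursor + durationMinutes <= windowEnd) {
    slots.push(cursor);
    cursor += durationMinutes;
  }
  return slots;
}
```

For the architects example, the agent would ask for ten 30-minute slots across next week's window and create one "Cold email <NAME>" task per returned slot. If fewer slots come back than requested, the agent would need to tell the user rather than double-book.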

Detailed requirements

Invocation & Context

This plugin can be invoked from inside a note with insertText or replaceText

Or app-wide from appOption

The options available for each invocation method will probably differ

The plugin will consider as "Context" the following:

The current line/task, when insertText is used on a non-empty line/task

The entire section/heading, when insertText is used on an empty line or inside a table

The selection when replaceText is used

The entire open note when appOption is used

The plugin will also offer an appOption that allows you to simply "Chat" (i.e. no context is used)
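The context rules above can be summarized in a small lookup function. This is only a sketch: the EditorSnapshot shape, the "chat" method name and resolveContext are hypothetical, and a real plugin would read these values through the Amplenote plugin API rather than from a plain object.

```typescript
// The three plugin actions named in this spec, plus the context-free "Chat" option.
type InvocationMethod = "insertText" | "replaceText" | "appOption" | "chat";

// Hypothetical snapshot of what the plugin can see at invocation time.
interface EditorSnapshot {
  currentLine: string;   // text of the line or task under the cursor
  insideTable: boolean;  // whether the cursor sits inside a table
  sectionText: string;   // text of the enclosing section/heading
  selectionText: string; // text currently selected by the user
  noteText: string;      // full text of the open note
}

function resolveContext(method: InvocationMethod, editor: EditorSnapshot): string {
  switch (method) {
    case "insertText":
      // Empty line or table: fall back to the whole section/heading.
      if (editor.insideTable || editor.currentLine.trim() === "") {
        return editor.sectionText;
      }
      return editor.currentLine;
    case "replaceText":
      return editor.selectionText;
    case "appOption":
      return editor.noteText;
    case "chat":
    default:
      // Plain chat: no context is passed to the LLM.
      return "";
  }
}
```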

No matter the invocation method, the AI chat interface will be opened in the Peek Viewer

LLM capabilities

Should support calling gpt-4 directly, and the Peek Viewer embed should be able to display any type of output that ChatGPT can display

Should support attaching local files to a chat prompt (if the API currently allows it)

Should support searching the web when prompted

Should support reading web page content given a valid URL

Amplenote capabilities

Behind the scenes, the plugin will have a comprehensive and modular interface for interacting with note contents, such that the AI Agent can call combinations of APIs in sequence to execute on a user's request. Examples:

Create a new note and give it a list of tags

Create a new task inside a specific note

Check a particular Task Domain and fetch a list of scheduled tasks

Find a task that was just created by the agent itself and modify its properties or delete it

Fetch notes by name or tag and read and process the contents

Create links to other notes

Apply markdown formatting

Etc.

When the user asks the agent to execute something, the agent should be able to string a series of API calls together to produce the desired output

Is this even possible?
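As a rough illustration of the chaining described above (not a feasibility claim), a modular interface could be a registry of small tools plus a runner that executes a plan step by step. Everything here is an assumption made for the sketch: the in-memory Workspace stands in for real Amplenote calls, the tool names are hypothetical, the "$N" convention for referring to an earlier step's result is invented for this example, and "Jane Doe" is placeholder data.

```typescript
// In-memory stand-ins; a real plugin would call the Amplenote plugin API instead.
interface NoteRecord { id: number; title: string; tags: string[]; body: string; }
interface TaskRecord { id: number; noteId: number; content: string; }
interface Workspace { notes: NoteRecord[]; tasks: TaskRecord[]; nextId: number; }

type ToolArgs = { [key: string]: string };
// Each tool mutates the workspace and returns the id of what it created or touched.
type AgentTool = (ws: Workspace, args: ToolArgs) => number;

const agentTools: { [toolName: string]: AgentTool } = {
  createNote: (ws, args) => {
    const id = ws.nextId++;
    const tags = args.tags ? args.tags.split(",") : [];
    ws.notes.push({ id: id, title: args.title, tags: tags, body: "" });
    return id;
  },
  appendToNote: (ws, args) => {
    const id = Number(args.noteId);
    for (const note of ws.notes) {
      if (note.id === id) {
        note.body += args.text;
        return id;
      }
    }
    throw new Error("Unknown note " + args.noteId);
  },
  createTask: (ws, args) => {
    const id = ws.nextId++;
    ws.tasks.push({ id: id, noteId: Number(args.noteId), content: args.content });
    return id;
  },
};

interface PlanStep { tool: string; args: ToolArgs; }

// Runs steps in order. An argument of the form "$N" is replaced by the
// result of step N (0-based), so later steps can build on earlier ones.
function runPlan(ws: Workspace, steps: PlanStep[]): number[] {
  const results: number[] = [];
  for (const step of steps) {
    const tool = agentTools[step.tool];
    if (!tool) throw new Error("Unknown tool " + step.tool);
    const resolved: ToolArgs = {};
    for (const key of Object.keys(step.args)) {
      const value = step.args[key];
      const ref = /^\$(\d+)$/.exec(value);
      if (ref && Number(ref[1]) >= results.length) {
        throw new Error("Step refers to a result that does not exist yet: " + value);
      }
      resolved[key] = ref ? String(results[Number(ref[1])]) : value;
    }
    results.push(tool(ws, resolved));
  }
  return results;
}

// Example plan for one CRM entry, as the LLM might emit it.
const demoWorkspace: Workspace = { notes: [], tasks: [], nextId: 1 };
const demoResults = runPlan(demoWorkspace, [
  { tool: "createNote", args: { title: "Jane Doe", tags: "CRM" } },
  { tool: "appendToNote", args: { noteId: "$0", text: "Website: (to be filled in)" } },
  { tool: "createTask", args: { noteId: "$0", content: "Cold email Jane Doe" } },
]);
```

Keeping each tool small and having the LLM emit only a list of steps means the plugin, not the model, decides what actually runs, which also gives a natural place to hook in confirmation prompts.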

Safety guards

When executing requests that require writing to notes or tasks, every individual write action (e.g. creating a note or editing a task) will first output the proposed contents inside the chat interface and then ask the user for manual confirmation before applying the operation.
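A confirmation gate of this kind might look like the sketch below. The PendingWrite and ConfirmFn types and the applyWithConfirmation function are hypothetical names; the confirmation callback is synchronous here only to keep the example short, whereas a real chat interface would wait for the user asynchronously.

```typescript
// A write the agent wants to perform, described before it is applied.
interface PendingWrite {
  description: string; // shown in the chat, e.g. "Create note 'X' tagged CRM"
  preview: string;     // the contents that would be written
  apply: () => void;   // performs the write; only called after approval
}

// Shows the preview to the user and returns their decision (assumed callback).
type ConfirmFn = (write: PendingWrite) => boolean;

interface WriteOutcome { description: string; applied: boolean; }

// Asks for confirmation once per write, never once per batch.
function applyWithConfirmation(writes: PendingWrite[], confirmWrite: ConfirmFn): WriteOutcome[] {
  const outcomes: WriteOutcome[] = [];
  for (const write of writes) {
    const approved = confirmWrite(write);
    if (approved) {
      write.apply();
    }
    outcomes.push({ description: write.description, applied: approved });
  }
  return outcomes;
}
```

Rejected writes are recorded rather than silently dropped, so the agent can report back in the chat which operations were skipped.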

Predefined prompts

The following should be added as one-click prompts available from insertText and replaceText:

Writing:

Write a blog post

Generate an outline for an article

Write a short summary

List key takeaways

Fix grammar and spelling

Improve writing

Continue writing this sentence

Suggest titles

Generate a counter argument

Generate an email

List action items

Custom prompts/agent editor

The plugin allows the user to create custom prompts to pass to the LLM

The plugin allows the user to reference the "context" described above with the placeholder {context}, written inside curly brackets. Attachments passed to the agent can be referenced the same way with {attachments}.

The prompt can be saved with a name, which will be visible in insertText and replaceText contexts.
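Placeholder substitution for saved prompts could work along these lines. This is a sketch under assumptions: the SavedPrompt shape and renderPrompt name are hypothetical, and representing attachments as a comma-separated list of file names is a guess, since the spec does not say how attachments are passed to the LLM.

```typescript
// A custom prompt as the user saved it (hypothetical shape).
interface SavedPrompt {
  name: string;     // label shown in the insertText and replaceText menus
  template: string; // may contain {context} and {attachments}
}

// Replaces every {context} and {attachments} placeholder.
// Any other text in curly brackets is left untouched.
function renderPrompt(
  prompt: SavedPrompt,
  contextText: string,
  attachmentNames: string[]
): string {
  const values: { [key: string]: string } = {
    context: contextText,
    attachments: attachmentNames.join(", "),
  };
  // A replacer function is used so that "$" sequences in the context
  // are inserted literally instead of being read as replacement patterns.
  return prompt.template.replace(
    /\{(context|attachments)\}/g,
    (_match: string, key: string) => values[key]
  );
}
```

For example, a saved prompt named "Summarize" with the template "Summarize this: {context}" would be sent to the LLM with the current line, section, selection or note substituted in, depending on how the plugin was invoked.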

Peek viewer interface

The main interface for the AI Agent should work like a chat with extra features

Chat:

Should allow sending messages to the LLM and reading responses

Should show a scrollable history of messages between the user and the agent

Should allow uploading files to a message

For every response, should allow copying the output or inserting it in a note

Options for inserting into a note:

Insert in new or existing note (prompt the user for these details)

Insert at the current position (when invoked via insertText)

Replace current selection (when invoked via replaceText)
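The insert options listed above depend on how the chat was opened, which can be expressed as a small mapping. This sketch assumes that "insert in new or existing note" is offered for every invocation method, which the list implies but does not state; the type and function names are hypothetical.

```typescript
// How the chat was opened, using the plugin action names from this spec.
type ChatOrigin = "insertText" | "replaceText" | "appOption";

// The insert choices shown next to each response (hypothetical identifiers).
type InsertTarget = "newOrExistingNote" | "currentPosition" | "replaceSelection";

function insertTargetsFor(origin: ChatOrigin): InsertTarget[] {
  // Assumed to be always available; the user is prompted for the note details.
  const targets: InsertTarget[] = ["newOrExistingNote"];
  if (origin === "insertText") {
    targets.push("currentPosition");
  }
  if (origin === "replaceText") {
    targets.push("replaceSelection");
  }
  return targets;
}
```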