Drop ephemeral sessions; refactor agent lifecycle around RunAsync + bus-driven stop #93
No reviewers
Labels
No labels
bug
commercial
documentation
duplicate
enhancement
feature
good first issue
help wanted
invalid
question
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
jasoncouture/llama-shears!93
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "claude-is-a-moron"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
EphemeralSession/EphemeralSessionFactory/SessionReplyToolmachinery; sessions are unified back intoAgent.SessionIdcarries an auto-generated v7Guidso the same agent can have multiple distinct boots;DataKey/DefaultSessionName/IsDefaultmove onto the record.IAgent.RunAsyncreplaces the start-then-loop split; agent shutdown is now triggered bycommand:agent-stopevents withagent:starting/started/stopping/stoppedlifecycle notifications.AgentManagerkeeps lifecycle and the loaded-handle dictionary; the blank-EC + scoped DI + data-context boot moves intoAgentFactorybehindIAgentFactory.AgentHandleowns child sub-handles.ExecuteAsyncdrains every handle viaCommand.AgentStopbroadcast before disposing.Test plan
dotnet buildcleandotnet testclean (unit + integration)🤖 Generated with Claude Code
Pull request overview
This PR removes the separate ephemeral-session runtime and refactors agent lifecycle around
IAgent.RunAsync,AgentFactory,AgentHandle, and bus-drivencommand:agent-stopshutdown events.Changes:
session_replytool, docs, and tests.Agent,AgentManager, and DI registration aroundAgentFactory.Reviewed changes
Copilot reviewed 52 out of 52 changed files in this pull request and generated 15 comments.
Show a summary per file
SessionId.RunAsync.RunAsync.RunAsync.RunAsync.RunAsync.StartAsyncwithRunAsync.IAgentFactoryand removes ephemeral factory registration.AgentStop.RunAsync.💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
@ -0,0 +7,4 @@## Types- [<G>$1E5D4A79A014BD10CBE0F98F7F208BE1](<G>$1E5D4A79A014BD10CBE0F98F7F208BE1.md)These generated documentation pages expose compiler-generated extension-block names (
<G>$...,<M>$...) instead of the publicSessionExtensions.GetSessionId/TryGetSessionIdAPI. Publishing these links makes the API docs hard to use and exposes implementation artifacts; the docs generation should suppress extension-block backing types or flatten the methods underSessionExtensions.@ -111,1 +148,4 @@public async ValueTask HandleAsync(IEventEnvelope<AgentStopRequest> envelope, CancellationToken cancellationToken){if (_shutdown.IsCancellationRequested) return;This opens the default/root context regardless of the active
SessionId, so non-default boots will read and drive the main transcript instead of their own isolated session transcript. Use the session id from the data scope when opening the context so child/sub-session runs don't share or corrupt the default conversation history.@ -112,0 +155,4 @@if (envelope.Data.SessionId != ownSessionId){return;}RunAsyncno longer enforces the previous single-start invariant. Calling it twice on the sameAgentwill open another context/loop over the same subscriptions and queue, leading to competing consumers and duplicate lifecycle/service starts; keep an interlocked/idempotency guard even with the new RunAsync lifecycle.The awaited
agent-stophandler only cancels the loop and returns; it does not wait for_loopStatus, so callers awaitingPublishAsync(command:agent-stop, ...)can observe completion beforestopping/stoppedhave been published and before agent services have drained. This contradicts the new stop contract and makes bus-driven shutdown races likely.ShutdownLoopAsynctreats_started == falseas “no loop to wait for,” butRunAsyncsets_startedonly after opening the context and publishingstarting. If disposal/stop races during that startup window,_loopStatusis completed early and the scope can be disposed whileRunAsyncis still executing startup code.PublishLifecycleEventAsyncnow sendsAgentEventInformationforagent:idleandagent:busy, but existing subscribers such asCompactionAgentServicesubscribe to those lifecycle events withAgentLifecycleMarker. Because the event bus dispatches by payload type, idle/busy subscribers will no longer receive these events, so idle compaction will stop running.@ -0,0 +38,4 @@globals[pair.Key] = pair.Value;}// These are authoratative, not to be overwriten by the caller.Correct the typos in this comment: “authoratative” should be “authoritative” and “overwriten” should be “overwritten”.
@ -0,0 +41,4 @@// These are authoratative, not to be overwriten by the caller.globals[AgentConfig.DataKey] = config;globals[ModelConfiguration.DataKey] = config.Model;globals[SessionId.DataKey] = session;The implementation applies caller-supplied
datafirst and then overwritesAgentConfig,ModelConfiguration, andSessionId, while the publicIAgentFactorycontract says caller data overlays the defaults and “caller wins.” This mismatch will surprise callers that rely on the documented overlay behavior; either the implementation or the public contract should be changed to match.@ -0,0 +49,4 @@var dataContextFactory = scope.ServiceProvider.GetRequiredService<IDataContextFactory>();dataContextFactory.CreateContext(config.Id);var dataProviders = scope.ServiceProvider.GetScopedDataProviders();await dataContextFactory.InitializeAsync(config.Id, dataProviders, globals, cancellationToken);The data context is still keyed only by
config.Id, so starting a second boot/session for the same agent will hitDataContextFactory.CreateContext's duplicate-key check and fail. Since the factory now accepts aSessionIdspecifically to support distinct boots, the context key needs to include the session identity (and corresponding cleanup must use the same key).@ -0,0 +53,4 @@_ = scope.ServiceProvider.GetRequiredService<ILanguageModel>();var agent = scope.ServiceProvider.GetRequiredService<IAgent>();var runTask = agent.RunAsync();StartAgentAsyncreturns immediately after callingagent.RunAsync()without awaiting any startup milestone or checking whether the returned task has already faulted. Because exceptions fromRunAsyncare captured on the task, failures opening the context, publishingstarting, or starting agent services can still produce anAgentHandlethat the manager records as loaded even though the agent loop is dead.This snapshots
_loaded.Keysfrom a plainDictionarywithout holding_mutex, while load/unload handlers can mutate the same dictionary under the mutex. During shutdown a concurrent load/unload can throw or produce inconsistent draining; take the snapshot while holding the mutex (or use a concurrent collection consistently).This
ContinueWithuses anasyncdelegate, which produces a nestedTask<Task>. The list stores the outer task, soTask.WhenAll(tasks)can complete beforelocalHandle.DisposeAsync()has finished, allowing host shutdown to exit without actually draining/disposal of the handles. Use an async helper or unwrap/await the inner task.This log records
handle.Id.Name, which is the session name (for the default boot, "default") rather than the agent id being stopped. Stop logs will therefore sayStopped agent 'default'for every default session, making operational diagnostics misleading.@ -12,9 +12,12 @@ namespace LlamaShears.Core.Abstractions.Agent.Sessions;/// e.g. <c>telegram:123456</c>, round-trip cleanly).The XML docs still describe the canonical form as
agentId:defaultChanneland say everything after the first colon is the channel/name, butToStringandTryParsenow requireagentId:{guid}:name. Public docs forSessionIdneed to be updated so consumers don't generate identifiers that can no longer parse.@ -65,0 +69,4 @@/// <summary>Agent shutdown is beginning — children are being disposed before the agent's own scope tears down.</summary>public static EventType Stopping {get;} = new EventType(Sources.Agent, "stopping");/// <summary>Agent shutdown is complete — scope disposed, data context deleted.</summary>public static EventType Stopped {get;} = new EventType(Sources.Agent, "stopped");These property accessors don't follow the spacing used by the surrounding well-known event declarations (
{ get; }). Keeping the same formatting here avoids churn in the public API definitions.