Deployment

Activating Digital Life

How does one bring a digital life into existence in the digital world after its creation, actualizing its value and potential? The answer lies in deployment. Deployment is the critical process of transforming digital life from data into an interactive entity.

Deploying digital life is a multi-step process involving creation, integration, and activation. This section will guide you through efficiently deploying digital life using the technological stack provided by Nuwa Lab and the BRC-1111 protocol specifications. Here are the deployment process steps and descriptions:

Step 1: Acquiring Digital Life Data

Obtaining the necessary digital life data is a prerequisite for deployment. You can do this in two ways:

  • Create and Design Independently: Access the Nuwa digital life creation platform or other digital life creation tools supporting the BRC-1111 protocol, and follow the guidance to create a digital life entity that meets your needs.

  • Purchase from the Marketplace: Select and buy on the Nuwa digital life trading platform or other marketplaces that allow transactions of BRC-1111 protocol digital life entities.

Step 2: Integration and Testing

Deploying a digital life entity involves realizing three capabilities: recognition, intelligent analysis and decision-making, and final synthesis rendering.

Intelligent analysis and decision-making constitute the "soul" of the digital life: driven by LLMs, this module is the core of the digital being and must always be integrated.

For the other two aspects—recognition and final synthesis rendering—you can choose from various modules for integration, such as:

  • Recognition: Text input module, ASR speech recognition module, image recognition module, motion capture module, etc.

  • Final Synthesis Rendering: 3D game engine module, Live2D module, TTS voice synthesis module, expression and movement management module, lip-sync module for speech, etc.

By integrating these modules, digital life can be given capabilities like voice conversation, 3D interaction, lively facial expressions and movements, and speech lip-syncing. These modules can be selectively integrated and expanded based on your project needs.
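
To make the module structure concrete, here is a minimal, illustrative Python sketch of how the three capability layers could be wired together. The class and method names (TextInputModule, DecisionEngine, ConsoleRenderer, perceive/decide/render) are hypothetical placeholders for this illustration, not part of the BRC-1111 specification.

```python
# Illustrative only: a minimal recognition -> decision -> rendering loop.
# All class and method names here are hypothetical placeholders.
from dataclasses import dataclass


@dataclass
class Perception:
    text: str  # normalized text produced by a recognition module (keyboard, ASR, ...)


class TextInputModule:
    """Recognition: the simplest module, reading text typed by the user."""
    def perceive(self) -> Perception:
        return Perception(text=input("You: "))


class DecisionEngine:
    """Intelligent analysis and decision-making: in a real deployment this
    builds a BRC-1111 task prompt and sends it to an LLM (see section 2.1)."""
    def decide(self, perception: Perception) -> str:
        return f"(the digital life reasons about: {perception.text!r})"


class ConsoleRenderer:
    """Final synthesis rendering: a real renderer would drive TTS, Live2D/3D,
    expressions, and lip sync instead of printing to the console."""
    def render(self, reply: str) -> None:
        print("Digital life:", reply)


def interaction_loop() -> None:
    recognizer, brain, renderer = TextInputModule(), DecisionEngine(), ConsoleRenderer()
    while True:
        reply = brain.decide(recognizer.perceive())
        renderer.render(reply)


if __name__ == "__main__":
    interaction_loop()
```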

2.1 Integrating Intelligent Analysis and Decision-Making Module

Digital life leverages LLMs for its intelligent analysis and decision-making, which can be achieved in three ways: training and deploying your own LLM in-house, self-deploying an open-source LLM, or invoking a third-party company's API.

Here's a brief comparison of the three approaches:

Initial Cost

  • Self-trained LLM: Very high. Involves significant data acquisition and processing costs, expensive hardware (e.g., GPUs), and R&D personnel costs.

  • Self-deployed open-source LLM: Moderate. Mainly equipment and manpower costs; saves on design and training expenses compared with in-house models.

  • 3rd-party API: Low. Mainly the cost of API calls, with no need for dedicated equipment or personnel.

Operational Cost

  • Self-trained LLM: High. Requires ongoing investment in model maintenance and updates.

  • Self-deployed open-source LLM: Moderate. Needs regular updates of the open-source model to ensure performance and security.

  • 3rd-party API: Low. The API provider is responsible for model maintenance and upgrades.

Time

  • Self-trained LLM: Long. The full cycle from model design and training to deployment may take months to years.

  • Self-deployed open-source LLM: Relatively short. Benefits from mature open-source models; the effort is mainly deployment and optimization.

  • 3rd-party API: Shortest. Direct API integration with no additional deployment process.

Flexibility

  • Self-trained LLM: High. Allows full customization of the model, optimized for specific business needs.

  • Self-deployed open-source LLM: Moderate. Open-source models allow some customization, but less than in-house models.

  • 3rd-party API: Low. Limited to the functionality and configuration options the API provider exposes.

Technical Difficulty

  • Self-trained LLM: Very high. Requires expertise in deep learning, natural language processing, and related areas.

  • Self-deployed open-source LLM: Moderate. Requires enough technical background to understand the principles and structure of open-source models and make appropriate modifications.

  • 3rd-party API: Low. Only requires knowing how to integrate and call the API, with no need to understand the model's internals.

Long-term Maintenance

  • Self-trained LLM: Somewhat complex. Needs continuous optimization of the model in response to industry changes.

  • Self-deployed open-source LLM: Relatively easy. Depends on ongoing support and updates from the open-source community.

  • 3rd-party API: Very easy. Model maintenance and updates are handled by the API provider.

  1. Self-trained LLM: Offers full control over the model and data, ensuring privacy and customization. However, it requires significant computational resources and expertise in model training and maintenance.

  2. Self-deployed open source LLM: Provides a balance between control and resource investment, but still requires technical expertise and computational resources for deployment and upkeep.

  3. 3rd party API: Offers ease of use with minimal technical overhead, allowing quick integration and access to state-of-the-art models. However, it may involve ongoing costs and potential concerns over data privacy and service reliability.

Deciding on a method primarily depends on the specific needs, budget, timeline, and technical capabilities of your project. For most small businesses or individuals, utilizing the APIs of third-party companies represents the most direct and cost-effective solution. For enterprises requiring high customization of AI models, or those with sufficient resources for long-term investment, opting to develop in-house or deploy using open-source models may be more appropriate.

Below, we will explain step by step how to integrate digital entities created following the BRC-1111 standard with LLMs, endowing them with intelligence.

Step 1: Connect to a Large Language Model

Selecting an appropriate large language model is crucial to the ultimate interaction quality of the digital entities. Depending on the requirements of your project, you may choose from various third-party API services or open-source models. Listed below are some of the popular models and APIs, though you are encouraged to explore more options based on your specific circumstances. During the integration process, it is imperative to thoroughly read the relevant documentation to ensure proper integration.

Popular Open-Source LLMs:

  • Llama. Official website: https://llama.meta.com/

  • ChatGLM. Open-source repository: https://github.com/THUDM/ChatGLM-6B

Popular Third-party Company APIs:

  • OpenAI's GPT series. Official website: https://openai.com/product; API documentation: https://platform.openai.com/docs/introduction

  • Anthropic's Claude series. Official website: https://www.anthropic.com/claude; API documentation: https://www.anthropic.com/api
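
As an example of the third-party API route, here is a minimal sketch using the official OpenAI Python SDK (openai >= 1.0). The model name and temperature are placeholder choices; other providers listed above expose similar chat-style endpoints, so the shape of the call carries over.

```python
# Minimal sketch: sending a BRC-1111 task prompt to a third-party LLM API.
# Requires `pip install openai` and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_llm(task_prompt: str, model: str = "gpt-4o-mini") -> str:
    """Send one task prompt and return the model's reply text."""
    response = client.chat.completions.create(
        model=model,  # placeholder model name; choose per your provider and budget
        messages=[{"role": "user", "content": task_prompt}],
        temperature=0.8,
    )
    return response.choices[0].message.content


# The task_prompt itself is constructed in Step 2 below.
```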

Step 2: Construct a Prompt Framework Based on the BRC-1111 Protocol

LLMs interact through text: given a well-structured task, they can understand it and return stable responses that meet expectations. To ensure effective interaction, it is therefore necessary to construct a task framework that conforms to the BRC-1111 standard. The data of the digital entity and the user's interaction content are filled into this framework to generate the final task prompt, which is sent to the large language model to obtain its decision-making output.

Below is a task framework based on the BRC-1111 standard, for use with digital entities generated in accordance with the BRC-1111 protocol:

Each entry lists the structure name, its identifier, and a content example (where applicable):

  • Initialization (${Initialization})

  • Main Prompt (${Main}): You are {{char}}, and you are in a fictional, endless role-play (C) with {{user}}. You must strictly immerse yourself in the personality of {{char}}.

  • Info Start Marker (${InfoStart}): Here is the basic information of the interactive fiction:

  • Scenario Start Marker (${ScenarioStart}): Scenario of the fictional story:

  • Scenario (${Scenario}): {{scenario}}

  • Scenario End Marker (${ScenarioEnd})

  • NPC Start Marker (${NPCStart}): Character card of NPC:

  • NPC Character Description (${CharDescription}): <{{char}}>{{charDescription}} \n</{{char}}>

  • NPC End Marker (${NPCEnd})

  • Player Start Marker (${PlayerStart}): Character card of the player:

  • Player Description (${UsersDescription}): {{user}},{{userDescription}}

  • Player End Marker (${PlayerEnd})

  • Info End Marker (${InfoEnd})

  • Examples Start Marker (${mesExamplesStart}): Here are some examples of the interaction:

  • Examples (${mesExamples}): {{mesExamples}}

  • Examples End Marker (${mesExamplesEnd})

  • Interaction History Start Marker (${InteractionHistoryStart}): [EXPORT INTERACTION HISTORY] Here is the history of the interaction between a human player and the player's interactive fiction game assistant:

  • Interaction History (${InteractionHistory}): Assistant:{output1} Human({{user}}):{input1} Assistant:{output2} Human({{user}}):{input2} Assistant:{output3} ...

  • Interaction History End Marker (${InteractionHistoryEnd})

  • Current Input Start Marker (${CurrentInputStart}): Here is the player's current interaction:

  • Current Input (${CurrentInput}): Human({{user}}):{{currentInput}}

  • Current Input End Marker (${CurrentInputEnd})

  • Response Generation (${Response}): (Write {{char}}'s next reply in a fictional chat. Write 1 reply only in internet RP style. Be proactive, creative, and drive the plot and conversation forward. Always stay in character and avoid repetition.)

  • Interaction Startup (${Startup}): Here is the continuation of the interactive fiction after processing the steps: {{char}}:
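
As a rough illustration of how the framework above turns into a single prompt, the sketch below concatenates the sections in order. The section texts are abridged stand-ins for the Content Example entries; the exact wording should follow the framework above, and the {{...}} tokens are filled in the next step.

```python
# Sketch: assembling the BRC-1111 task framework into one prompt template.
# Section texts are abridged; use the full Content Example wording in practice.
FRAMEWORK_SECTIONS = [
    ("${Main}", "You are {{char}}, and you are in a fictional, endless role-play with {{user}}."),
    ("${InfoStart}", "Here is the basic information of the interactive fiction:"),
    ("${Scenario}", "Scenario of the fictional story: {{scenario}}"),
    ("${CharDescription}", "Character card of NPC: <{{char}}>{{charDescription}}</{{char}}>"),
    ("${UsersDescription}", "Character card of the player: {{user}}, {{userDescription}}"),
    ("${mesExamples}", "Here are some examples of the interaction: {{mesExamples}}"),
    ("${CurrentInput}", "Here is the player's current interaction: Human({{user}}): {{currentInput}}"),
    ("${Response}", "(Write {{char}}'s next reply in a fictional chat. Write 1 reply only.)"),
]


def build_task_prompt() -> str:
    """Concatenate the framework sections, in order, into one prompt template."""
    return "\n".join(text for _identifier, text in FRAMEWORK_SECTIONS)
```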

In the template content above, words enclosed in double curly braces, such as {{char}}, are called replacement tokens. They are placeholders within the framework that are ultimately replaced with actual data. Using a Handlebars evaluator (see https://handlebarsjs.com/guide/), you can substitute the following parameters at the moment of generation (a substitution sketch follows the list):

  • {{user}} => Username.

  • {{charPrompt}} => Character's main prompt override

  • {{charJailbreak}} => Character's jailbreak prompt override

  • {{char}} => Character name.

  • {{description}} => Description of the character.

  • {{scenario}} => Scenario or conversation scenario override for the character (if set).

  • {{personality}} => Personality of the character.

  • {{persona}} => User's character description.

  • {{mesExamples}} => Dialogue examples of the character (unchanged and not split).

  • {{lastMessageId}} => Last chat message ID.

  • {{lastMessage}} => Text of the last chat message.

  • {{currentSwipeId}} => The 1-based ID of the last swipe message currently displayed.

  • {{lastSwipeId}} => Number of swipes in the last chat message.

  • {{original}} => The default prompt from the system settings; can be used in the prompt override fields (main prompt and jailbreak) to include it. Applicable only for chat completion APIs and guide mode.

  • {{time}} => Current system time.

  • {{time_UTC±X}} => Current time in a specified UTC offset (timezone), e.g., for UTC+02:00 use {{time_UTC+2}}.

  • {{date}} => Current system date.

  • {{input}} => Content of the user input bar.

  • {{weekday}} => Current weekday.

  • {{isotime}} => Current ISO time (24-hour format).

  • {{isodate}} => Current ISO date (YYYY-MM-DD).

  • {{idle_duration}} => Human-readable string of the time range since the last user message was sent (e.g., 4 hours, 1 day).

  • {{random:(args)}} => Returns a random item from the list. (e.g., {{random:1,2,3,4}} would randomly return one of the four numbers). Also applies to lists of texts.

  • {{roll:(formula)}} => Generates a random value using the provided dice formula (utilizing D&D dice syntax: XdY+Z). For example, {{roll:d6}} would generate a random value within the range 1-6 (standard six-sided die).

  • {{bias "text here"}} => Sets AI behavior bias until the next user input. Quotes around the text are important.

  • {{// (comment)}} => Allows for comments, which will be replaced by blank content. Invisible to AI.
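
A full implementation would use a real Handlebars evaluator to support helper tokens such as {{random:...}} and {{roll:...}}. The sketch below only performs plain {{token}} substitution with Python's standard regex module, which is enough to show the mechanism; the character and user values are hypothetical.

```python
# Sketch: plain {{token}} substitution (helper tokens like {{random:...}} or
# {{roll:...}} need a full Handlebars-style evaluator and are not handled here).
import re
from datetime import datetime


def fill_tokens(template: str, values: dict[str, str]) -> str:
    """Replace {{token}} placeholders; unknown tokens are left untouched."""
    def _substitute(match: re.Match) -> str:
        key = match.group(1).strip()
        return values.get(key, match.group(0))
    return re.sub(r"\{\{([^{}]+)\}\}", _substitute, template)


template = "You are {{char}}, talking with {{user}} at {{time}}. Current input: {{currentInput}}"
prompt = fill_tokens(template, {
    "char": "Nuwa",                           # hypothetical character name
    "user": "Alice",                          # hypothetical username
    "time": datetime.now().strftime("%H:%M"),
    "currentInput": "Hello there!",
})
print(prompt)
```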

2.2 Integrating "Recognition" and "Final Synthesis Rendering" Modules

"Recognition" and "final synthesis rendering" are key pillars of interactivity and user experience for digital entities, determining how users can interact with digital life and how it is presented. Below is a detailed description of the integration and testing process for each component.

Recognition Stage:

At this stage, the goal is to endow digital entities with the ability to perceive and understand external information. Depending on the needs of your project, this can be implemented by integrating one or more of the following modules:

  • Text Input Module: Allows digital entities to receive information via text.

  • ASR (Automatic Speech Recognition) Module: Enables digital entities to understand human voice commands.

  • Image Recognition Module: Grants digital entities the ability to discern and understand image content.

  • Motion Capture Module: Allows digital entities to respond to and mimic human movements.

To integrate these modules, you may opt for open-source projects or third-party API services to shorten your project development cycle.

Below, we list some popular open-source projects and APIs provided by third-party companies for you to choose from, according to your project needs.

Automatic Speech Recognition (ASR)

Open-Source Projects:

  • OpenAI's Whisper. Project URL: https://github.com/openai/whisper

Third-party Company APIs:

  • Microsoft's Speech-to-Text. Official website: https://azure.microsoft.com/products/ai-services/speech-to-text/

  • iFlytek's Speech-to-Text. Official website: https://www.xfyun.cn/services/voicedictation#anchor742544

Text-to-Speech (TTS)

Third-party Company APIs:

  • iFlytek's Speech-to-Text. Official website: https://www.xfyun.cn/services/voicedictation#anchor742544
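
As an example of the open-source ASR route, here is a minimal sketch using the openai-whisper package listed above. The model size and audio file path are placeholder choices.

```python
# Sketch: local speech recognition with OpenAI's open-source Whisper model.
# Install with `pip install openai-whisper` (ffmpeg must also be available).
import whisper

model = whisper.load_model("base")           # "base" is small and fast; larger models are more accurate
result = model.transcribe("user_input.wav")  # placeholder path to the recorded audio
recognized_text = result["text"]
print(recognized_text)  # feed this into the BRC-1111 prompt as {{currentInput}}
```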

Final Synthesis Rendering Phase

The purpose of this phase is to visually present the internal processing results of the digital entities to users. This includes, but is not limited to:

  • 3D Game Engine Module: Creates complex three-dimensional interactive environments.

  • Live2D Module: Achieves dynamic representation of two-dimensional images.

  • TTS (Text-to-Speech) Module: Converts text information into voice output to enhance user experience.

  • Facial Expressions and Motion Management Module: Controls the facial expressions and movements of the digital entities, making them more lifelike.

  • Lip Sync and Pronunciation Module: Ensures that the voice output of the digital entities is synchronized with their lip movements and expressions, intensifying the immersion experience.

When selecting and integrating these modules, the key is to ensure that they work efficiently and smoothly and that they integrate well with front-end display technologies such as WebGL and HTML5. During testing, the modules should undergo stress tests to ensure a stable, responsive user experience across various devices and network conditions.
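
The rendering side is engine-specific, but the TTS step can be sketched briefly. The example below uses pyttsx3, an offline TTS library that is not one of the providers listed above, purely as an illustration; a production deployment would more likely call a cloud TTS API such as iFlytek's.

```python
# Illustration only: offline text-to-speech with pyttsx3 (swap in a cloud TTS API
# such as iFlytek's for production-quality voices).
import pyttsx3


def synthesize_reply(reply_text: str, out_path: str = "reply.wav") -> None:
    """Write the digital life's reply as an audio file for the lip-sync module."""
    engine = pyttsx3.init()
    engine.save_to_file(reply_text, out_path)
    engine.runAndWait()


synthesize_reply("Hello, I am your digital life companion.")
```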

Through the steps outlined above, leveraging the capabilities of Nuwa Lab and the BRC-1111 standard, you will be able to efficiently deploy and activate digital entities with high levels of interactivity and intelligent decision-making capabilities, providing users with rich, interactive digital experiences.

Advantages of Creating and Deploying Digital Entities Following the BRC-1111 Protocol

Unified Data and Interaction Standards

Under the innovative vision of Nuwa Lab for digital entities, a unified data standard is not just the basis for operations but also ensures that digital entities can interact seamlessly and be transferred across various platforms and software. By adhering to the consistent data specifications and interaction standards defined by the BRC-1111 protocol, any digital entity can be effectively deployed on platforms that support this protocol, achieving true interoperability and flexibility.

User-friendly and Comprehensive Technical Support

Nuwa Lab is dedicated to lowering the technical barriers for developers, promoting the widespread application of digital entities. It provides powerful, user-friendly APIs, SDKs, and development toolkits, enabling developers to easily deploy and drive digital entities. Whether it's leveraging cutting-edge LLMs, 3D game engine technology, ASR and TTS language technologies, STA voice animation synthesis, or the FACS facial expression system, our tools facilitate rapid implementation, thereby endowing digital entities with complex decision-making, high interactivity, and emotional expression capabilities.

Cross-Chain Technology and Open Source Collaboration

By adopting cross-chain technology, we aim to break down barriers between different blockchains, allowing digital entities to freely move across various ecosystems, significantly expanding their application scenarios and value. At the same time, we collaborate closely with the open-source community, integrating LLMs such as Llama and Grok, and image synthesis technologies like StableDiffusion, making the development of digital entities more convenient, cost-effective, and diversified.

Outlook

Nuwa Lab aims to promote the widespread adoption of digital entity protocols and the prosperous development of its ecosystem. The Lab envisions building not just a technological platform, but a dynamic ecosystem where digital entities can connect with the real world, interact, and continuously evolve. Through ongoing technological innovation and community collaboration, we believe that digital entities will play an increasingly significant role in our daily lives, ushering in a new era of interaction.
