Knowledge Sources

The Knowledge Sources page lists every repository of information that Ept AI can search and use to generate responses. This is where administrators manage the content that powers the AI system.

Overview

Knowledge sources are the raw material used by the AI to answer questions. They can include websites, PDFs, forums, GitHub repositories, support databases, and other information repositories. Each source is assigned a unique identifier (KSID) and can be marked as confidential to control access.

Key Capabilities

Source Management

Add Knowledge Source

The "Add a knowledge source" button opens a wizard to ingest new content:

Supported Source Types:

  • Web Sites: Crawl public or private websites
  • PDF Documents: Upload or link to PDF files
  • Community Forums: Connect to discussion boards and public forums
  • GitHub Repositories: Access code repositories and documentation
  • APIs: Connect to external data sources via API
  • File Uploads: Direct file uploads for documents and media
  • Zendesk Knowledge Bases: Import articles and documentation from Zendesk
  • HubSpot Knowledge Bases: Connect to HubSpot knowledge repositories
  • Anonymized Support Tickets: Import solved support tickets with personal information removed
  • CRM Objects: Connect to CRM data like opportunities, accounts, and contacts
  • Contributed Knowledge: Knowledge added via brain-dump from AI Performance Management
  • Canned Responses: Exact-match answers for frequently asked questions

Configuration Options:

  • Source Name: Descriptive identifier for the source
  • Access URLs: Web addresses or file locations
  • Authentication: Credentials for private sources (API keys, OAuth tokens, etc.)
  • Crawl Settings: Depth and scope of content ingestion
  • Confidentiality: Mark as confidential if containing sensitive information
  • Content Filtering: Specify what content to include or exclude
  • Anonymization Settings: Configure personal information removal for support data
  • Refresh Schedule: Set automatic update frequency for dynamic sources
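
The options above can be pictured as a single configuration record. The sketch below is illustrative only: the `SourceConfig` class, its field names, its defaults, and the example URL are assumptions made for this page, not Ept AI's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SourceConfig:
    """Hypothetical record mirroring the wizard's configuration options."""
    name: str                                   # Source Name
    access_urls: list[str]                      # Access URLs
    auth_token: Optional[str] = None            # Authentication (private sources)
    crawl_depth: int = 2                        # Crawl Settings (assumed default)
    confidential: bool = False                  # Confidentiality
    include_patterns: list[str] = field(default_factory=list)  # Content Filtering
    exclude_patterns: list[str] = field(default_factory=list)
    anonymize: bool = False                     # Anonymization Settings
    refresh_hours: Optional[int] = None         # Refresh Schedule; None = manual only

    def validate(self) -> list[str]:
        """Return human-readable problems; an empty list means no problem was found."""
        problems = []
        if not self.name.strip():
            problems.append("name is empty")
        if not self.access_urls:
            problems.append("no access URLs given")
        if self.crawl_depth < 0:
            problems.append("crawl depth must be non-negative")
        if self.refresh_hours is not None and self.refresh_hours <= 0:
            problems.append("refresh interval must be positive")
        return problems


docs = SourceConfig(
    name="Product docs",
    access_urls=["https://docs.example.com"],
    exclude_patterns=["*/changelog/*"],
    refresh_hours=24,
)
print(docs.validate())
```

Writing the settings down in this form before opening the wizard makes it easier to review confidentiality and refresh choices with the rest of the team.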

Sources Table

The main table lists all knowledge sources, one per row, with the following columns:

| Column | Description |
|---|---|
| Name | Descriptive name for the source |
| Type | Source type (website, PDF, forum, Zendesk, contributed, etc.) |
| URL | Source location or file reference |
| Confidential | Checkbox indicating sensitive content |
| Last Updated | When the source was last modified or reprocessed |

Source Operations and Interactive Features

Table Management

  • Search Functionality: Search sources by name, type, URL, or content
  • Column Sorting: Sort by Name, Type, Last Updated, or Confidential status
  • Filter Options: Filter by source type, confidential status, or processing status
  • Pagination: Navigate through large source collections efficiently

Row Actions

  • View Details: Click any row to open detailed source information
  • Status Monitoring: Check ingestion and processing status with visual indicators
  • Actions Menu: Vertical-ellipsis menu with the following options:
    • Edit Source: Modify source configuration and settings
    • Reprocess: Trigger manual reingestion of source content
    • Duplicate: Create a copy of the source configuration
    • Export Data: Download source content and metadata
    • View Usage: See which channels and responses use this source
    • Archive: Move to archived sources for historical reference
    • Delete: Permanently remove source (admin only)

Bulk Operations

  • Multi-Select: Select multiple sources using checkboxes
  • Bulk Reprocessing: Reprocess multiple sources simultaneously
  • Bulk Status Updates: Change status for multiple sources at once
  • Bulk Export: Download data from multiple sources in one operation
  • Bulk Delete: Remove multiple sources (with confirmation)

Knowledge Source Details

Clicking a source row opens a detail page with two main sections:

Versions Section

Track and manage different versions of your knowledge source:

Versions Table:

  • Version ID: Unique identifier for each ingestion
  • Status: Current status of the version; the active version is highlighted
  • Ingest Date: When the version was created
  • Processing Stats: Statistics about ingestion (pages processed, errors, etc.)
  • Actions: Manage version operations

Version Operations:

  • Make Active: Set a specific version as the current active version
  • Reprocess: Trigger reingestion of the source content
  • Archive: Remove old versions to save space
  • View Details: See detailed processing information and any errors

Items Section

View and manage individual content items extracted from the source:

Items Table:

  • Name: Item title or identifier
  • Type: Content type (page, document section, forum post, etc.)
  • URL: Direct link to the original content
  • Uses: Number of times the AI has retrieved this item for responses
  • Last Updated: Most recent modification date
  • Tags: Applied categorization tags

Item Management:

  • Search Items: Filter items by name or content using the search bar
  • Usage Analytics: Identify which items are most valuable based on retrieval frequency
  • Content Preview: View item content and metadata
  • Tag Management: Organize items with custom tags
  • Pagination: Navigate through large sets of items efficiently

Performance Insights:

  • Popular Items: Identify most frequently used content for optimization
  • Unused Content: Find items that may need improvement or removal
  • Update Tracking: Monitor when items were last modified or accessed
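
As a rough illustration of how the Uses column can drive these insights, the sketch below ranks items by retrieval count and separates out items that were never retrieved. The `Item` class, the `split_by_usage` helper, and the sample data are hypothetical; in practice the numbers would be read from the Items table.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    uses: int  # number of times the AI retrieved this item


def split_by_usage(items, top_n=3):
    """Return (most-used items, never-used items)."""
    ranked = sorted(items, key=lambda i: i.uses, reverse=True)
    popular = [i for i in ranked[:top_n] if i.uses > 0]
    unused = [i for i in items if i.uses == 0]
    return popular, unused


items = [
    Item("Install guide", 42),
    Item("Legacy FAQ", 0),
    Item("API reference", 17),
    Item("Old release notes", 0),
]
popular, unused = split_by_usage(items, top_n=2)
print([i.name for i in popular])
print([i.name for i in unused])
```

Items in the second list are candidates for review: they may be outdated, poorly titled, or simply irrelevant to the questions users ask.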

Source Types and Configuration

Website Sources

  • Public Sites: Configure for open web content
  • Private Sites: Set up authentication credentials
  • Crawl Depth: Control how deep to follow links
  • Content Types: Specify which file types to include
  • Update Schedule: Set automatic reprocessing frequency

Document Sources

  • PDF Files: Text extraction and processing
  • Word Documents: Content parsing and indexing
  • Presentations: Extract text from slides
  • Spreadsheets: Process tabular data appropriately
  • File Uploads: Drag-and-drop direct file uploads

Platform Integrations

  • Zendesk Knowledge Bases: Direct integration with Zendesk Help Centers
  • HubSpot Knowledge Bases: Import from HubSpot knowledge repositories
  • Forums: Connect to community forums and discussion platforms
  • GitHub Repositories: Access code repositories and documentation
  • APIs: Real-time data integration with external systems

Support and CRM Data

  • Anonymized Support Tickets: Import solved tickets with personal information removed for privacy
  • CRM Objects: Connect to opportunities, accounts, contacts, and other CRM data
  • Canned Responses: Pre-written answers for exact-match questions
  • Customer Communications: Import relevant customer interaction history

Dynamic Sources

  • Contributed Knowledge: Knowledge added through AI Performance Management brain-dump feature
  • Voice Transcripts: Knowledge captured from voice recordings and transcribed to text
  • User-Generated Content: Knowledge contributed by team members during AI interactions

Confidentiality Rules and Access Control

How Confidentiality Works

Confidentiality in knowledge sources operates through a cascading system that affects entire Knowledge Source Configurations (KSCs):

Key Principles:

  • Source-Level Marking: Individual knowledge sources can be marked as confidential
  • KSC Inheritance: If any knowledge source in a KSC is marked confidential, the entire KSC becomes confidential
  • Channel Restriction: Confidential KSCs can only be used by channels that are also marked as confidential
  • Access Propagation: Confidentiality restrictions automatically propagate through the entire system

Confidentiality Cascade Effect

  1. Knowledge Source: Mark individual sources as confidential (e.g., internal documentation, customer data)
  2. Knowledge Source Configuration: Any KSC containing a confidential source becomes confidential
  3. Channel Access: Only confidential channels can access confidential KSCs
  4. User Access: Users can only access confidential content through properly configured confidential channels
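
The cascade can be expressed as two small rules. The following is a minimal sketch, not Ept AI's implementation: the `Source`, `KSC`, and `Channel` classes and the `channel_can_use` function are hypothetical names, and the sketch assumes that a non-confidential KSC may be used by any channel, which the rules above imply but do not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    ksid: str
    confidential: bool = False


@dataclass(frozen=True)
class KSC:
    name: str
    sources: tuple[Source, ...]

    @property
    def confidential(self) -> bool:
        # Inheritance rule: one confidential source makes the whole KSC confidential.
        return any(s.confidential for s in self.sources)


@dataclass(frozen=True)
class Channel:
    name: str
    confidential: bool = False


def channel_can_use(channel: Channel, ksc: KSC) -> bool:
    """A confidential KSC requires a confidential channel (assumption: others are open)."""
    return channel.confidential or not ksc.confidential


public_docs = Source("ks-001")
internal_wiki = Source("ks-002", confidential=True)
mixed = KSC("support", (public_docs, internal_wiki))
open_ksc = KSC("public", (public_docs,))
website_chat = Channel("website-chat")
staff_slack = Channel("staff-slack", confidential=True)

print(mixed.confidential)                        # True
print(channel_can_use(website_chat, mixed))      # False
print(channel_can_use(staff_slack, mixed))       # True
print(channel_can_use(website_chat, open_ksc))   # True
```

Note the practical consequence shown by `mixed`: adding a single confidential source to a KSC that a public channel relies on removes that channel's access to every source in the KSC, not only the confidential one.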

Managing Confidential Content

When to Mark Sources as Confidential:

  • Internal company documentation
  • Customer-specific information
  • Proprietary technical details
  • Personal or sensitive data
  • Competitive intelligence
  • Support tickets with customer information (even if anonymized)

Impact of Confidential Marking:

  • Limits which channels can access the content
  • Requires careful channel configuration to maintain access
  • Affects AI response availability in public vs private channels
  • Requires proper user permissions for access

Best Practices for Confidentiality

  1. Clear Classification: Establish clear criteria for what should be marked confidential
  2. Consistent Application: Apply confidentiality markings consistently across similar content types
  3. Channel Planning: Plan channel configurations before marking sources as confidential
  4. Access Review: Regularly review which sources need confidential treatment
  5. Documentation: Document why sources are marked confidential for future reference

Adding Knowledge from AI Performance Management

When users contribute knowledge through the AI Performance Management interface (brain-dump feature), new knowledge sources are automatically created:

Process:

  1. User Contribution: User provides knowledge via voice or text during response review
  2. Automatic Source Creation: System creates a new knowledge source labeled "contributed knowledge"
  3. Content Processing: AI processes and indexes the contributed information
  4. KSC Integration: New source is added to appropriate Knowledge Source Configurations
  5. Confidentiality Assessment: System applies appropriate confidentiality settings based on content and context

Troubleshooting

Source Not Updating

If a knowledge source isn't reflecting recent changes:

  1. Manual Reprocessing: Use the "Ingest" button to trigger an update
  2. URL Access: Verify the source URL is still accessible
  3. Authentication: Check if credentials have expired
  4. Filter Settings: Ensure URL filters aren't excluding updated content
  5. Version Check: Confirm the latest version is marked as "Active"
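
To see how a filter can silently exclude updated pages (step 4), it can help to test a few URLs against the filter patterns offline. The sketch below assumes glob-style patterns and assumes that exclude patterns take precedence over include patterns; the actual filter syntax and precedence in Ept AI may differ, and `url_is_ingested` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def url_is_ingested(url, include=(), exclude=()):
    """Exclude patterns win; with no include patterns, anything not excluded passes."""
    if any(fnmatchcase(url, pattern) for pattern in exclude):
        return False
    return not include or any(fnmatchcase(url, pattern) for pattern in include)


include = ["https://docs.example.com/guide/*"]
exclude = ["*/archive/*"]

for url in (
    "https://docs.example.com/guide/new-page",
    "https://docs.example.com/guide/archive/old-page",
    "https://docs.example.com/blog/announcement",
):
    print(url, url_is_ingested(url, include, exclude))
```

If a recently published page lands outside every include pattern, as the blog URL does here, reprocessing the source will not pick it up until the filter is widened.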

Missing Content

If expected content isn't appearing in AI responses:

  1. Ingestion Status: Check that the source status shows "complete"
  2. Item Verification: Look in the Items section to confirm content was extracted
  3. Confidentiality Mismatch: Verify that the source's confidentiality setting matches the channel's; confidential sources require confidential channels
  4. KSC Inclusion: Ensure the source is included in relevant Knowledge Source Configurations
  5. Channel Configuration: Check that the channel's KSC includes this knowledge source
  6. URL Filters: Check that filters aren't excluding important content
  7. Access Permissions: Verify user has permission to access confidential content if applicable
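
The checklist above can be run as an ordered set of checks. The sketch below is a hypothetical helper, not a product feature: the dictionary keys are invented for illustration, and the values would have to be filled in by hand from the Sources table, the Items section, and the channel settings.

```python
def diagnose_missing_content(state: dict) -> list[str]:
    """Return findings in the same order as the checklist; empty means all checks pass."""
    confidential = state.get("source_confidential", False)
    findings = []
    if state.get("ingestion_status") != "complete":
        findings.append("ingestion not complete")
    if not state.get("item_extracted", False):
        findings.append("item not found in Items section")
    if confidential and not state.get("channel_confidential", False):
        findings.append("confidential source on a non-confidential channel")
    if not state.get("source_in_ksc", False):
        findings.append("source not included in a KSC")
    if not state.get("ksc_on_channel", False):
        findings.append("channel's KSC does not include the source")
    if state.get("excluded_by_filter", False):
        findings.append("URL filter excludes the content")
    if confidential and not state.get("user_has_access", False):
        findings.append("user lacks permission for confidential content")
    return findings


observed = {
    "ingestion_status": "complete",
    "item_extracted": True,
    "source_confidential": True,
    "channel_confidential": False,
    "source_in_ksc": True,
    "ksc_on_channel": True,
    "excluded_by_filter": False,
    "user_has_access": True,
}
print(diagnose_missing_content(observed))
```

Working through the checks in order tends to surface the cheapest explanations first: an incomplete ingestion or a missing item rules out everything downstream.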

Processing Errors

If ingestion fails or shows errors:

  1. Source Accessibility: Verify the source URL is reachable
  2. Authentication: Check credentials for private sources
  3. Content Format: Ensure source format is supported
  4. Size Limits: Large sources may need special handling
  5. Network Issues: Consider connectivity problems

Performance Issues

If sources are slow to process:

  1. Source Size: Large sources take longer to process
  2. URL Filters: Use filters to limit scope and improve speed
  3. Content Type: Some formats are more processing-intensive
  4. Network Speed: Source location affects download time
  5. System Load: Processing capacity may affect speed

Best Practices

Content Curation

  1. Quality Over Quantity: Focus on high-quality, relevant sources
  2. Regular Updates: Keep sources current and accurate
  3. Clear Naming: Use descriptive names that indicate content scope and confidentiality level
  4. Proper Classification: Mark confidential sources appropriately and understand the cascade effects
  5. Documentation: Maintain descriptions explaining each source's purpose and confidentiality rationale
  6. Contributed Knowledge: Encourage team members to contribute knowledge through AI interactions
  7. Source Diversity: Include various source types to provide comprehensive coverage

Source Organization

  1. Logical Grouping: Group related sources in Knowledge Source Configurations
  2. Version Control: Regularly create new versions for important sources
  3. Archive Management: Remove outdated versions to save space
  4. Tag Strategy: Develop consistent tagging for easy discovery
  5. Access Control: Carefully manage confidential source access

Performance Optimization

  1. URL Filtering: Use filters to focus on relevant content only
  2. Update Frequency: Balance freshness with processing load
  3. Source Monitoring: Regularly check processing status and errors
  4. Usage Analysis: Review item usage statistics to identify valuable content
  5. Cleanup: Remove unused or low-quality sources

Related Pages

  • Knowledge Source Configurations: Group sources for use by specific channels
  • Channels: Configure which sources are available to different interfaces
  • Responses: Monitor how knowledge sources contribute to AI responses
  • Users: Manage who can create and modify knowledge sources