Trino Datasource Integration

Knowi natively integrates with Trino, enabling you to query and visualize data from multiple data sources through Trino's distributed SQL query engine. Transform and visualize your Trino data using drag & drop visualizations, combine data from multiple sources, and automate your Trino reporting within Knowi.

Overview

There are multiple approaches to integrate with your Trino cluster.

With either connectivity option, Knowi enables you to issue native Trino SQL queries to one or more catalogs to create your datasets for visualizations, then blend that data with other SQL/NoSQL sources, automate queries, and share dashboards.

UI Based Approach

Connecting

Trino Connection

  1. Log into Knowi and click on Menu > Data Sources
  2. Click on + New Datasource button
  3. Select Trino from Data Warehouses section
  4. Configure the datasource connection:

    Field Description
    Datasource Name A friendly name for your Trino connection
    Hostname/IP Hostname or IP address of your Trino coordinator
    Port Port number where Trino is listening (default: 8080)
    Catalog The data source catalog to connect to (e.g., tpch, mysql, postgresql)
    Schema The schema within the catalog (optional, but recommended)
    Username Username for authentication (optional)
    Password Password for authentication (optional)
    Properties Additional connection properties (e.g., SSL=true&SSLVerification=NONE)
  5. Click Test Connection to validate your settings

  6. Click Save to save the datasource

Note: Ensure your Trino cluster is accessible from Knowi's servers. For clusters on your internal network, use the Cloud9Agent approach.

Query

Once connected, you can query your Trino data using SQL:

Visual Builder

Trino Visual Query Builder

  1. Navigate to Queries and create a new query
  2. Select your Trino datasource
  3. Use the Visual Builder mode:
    • Select tables from the schema explorer on the left
    • Drag and drop fields to build your query
    • Apply filters, groupings, and joins visually
    • The SQL is generated automatically

Query Editor

Trino Query Editor

  1. Switch to Query Editor mode for direct SQL editing
  2. Write native Trino SQL queries:

```sql -- Query across multiple catalogs SELECT c.custkey, c.name as customer_name, o.orderdate, o.totalprice FROM tpch.sf1.customer c JOIN tpch.sf1.orders o ON c.custkey = o.custkey WHERE o.orderdate > DATE '2024-01-01' ORDER BY o.totalprice DESC LIMIT 100 ```

```sql -- Cross-catalog query SELECT m.user_id, m.username, p.total_orders, p.lastorderdate FROM mysql.users.members m JOIN postgresql.analytics.purchasesummary p ON m.userid = p.customer_id WHERE p.total_orders > 10 ```

AI Assistant Features

Data Execution Strategy

Choose between: - Direct Query: Execute queries in real-time against Trino - Cached Data: Store results for faster dashboard loading

  1. Click Preview to see query results
  2. Click Save Dataset to save for visualization

Advanced Features

Session Properties

Set session properties for query optimization:

```sql SET SESSION distributed_join = true; SET SESSION querymaxmemory = '8GB';

SELECT * FROM largetable JOIN anothertable... ```

Catalog and Schema Navigation

Cloud9Agent

For Trino clusters on your internal network, use the Cloud9Agent:

Cloud9Agent Configuration

Configure your Cloud9Agent agent.properties file:

```properties

Trino Datasource Configuration

datasource1.name=Internal Trino Cluster datasource1.type=TRINO datasource1.host=trino-coordinator.internal datasource1.port=8080 datasource1.catalog=hive datasource1.schema=analytics datasource1.username=admin datasource1.password=your_password datasource1.useSSL=true datasource1.properties=SSL=true&SSLVerification=NONE ```

Parameter Description Required
type Must be "TRINO" for Trino connections Yes
host Trino coordinator hostname Yes
port Trino port (default: 8080) Yes
catalog Default catalog to use Yes
schema Default schema within the catalog No
username Authentication username No
password Authentication password No
useSSL Enable SSL/TLS connection No
properties Additional connection properties No

Best Practices

  1. Specify Schema: Always specify a schema for better performance and to avoid scanning all schemas
  2. Use Appropriate Filters: Push down predicates to reduce data transfer
  3. Leverage Trino's Features: Use Trino's distributed processing for complex analytics
  4. Monitor Query Performance: Use Trino's Web UI to monitor query execution

Troubleshooting

Common issues and solutions:

  1. Connection Timeout: Ensure Trino is accessible from Knowi's servers or Cloud9Agent
  2. Authentication Failed: Verify username/password and authentication configuration
  3. Catalog Not Found: Check that the catalog exists and is properly configured in Trino
  4. SSL/TLS Errors: Verify SSL settings and certificate configuration

For additional help, see the Trino documentation or contact Knowi support.