DuckDB DuckPGQ With ODBC: Troubleshooting Guide

by TheNnagam 48 views

DuckPGQ and ODBC: A Troubleshooting Deep Dive

Hey guys, let's dive into a bit of a head-scratcher I've been wrestling with: getting DuckPGQ to play nice with ODBC, specifically from a C# application. I saw the buzz around the DuckDB graph queries and thought it'd be super cool to integrate this into my existing C# ODBC app. I'm using DuckDB 1.4.1, and the duckpgq extension installed and loaded without a hitch. The issue, however, lies in getting those graph queries to actually return data when run through ODBC.

The Setup

First off, the basics. My setup involves a C# ODBC application that runs SQL files. I'm taking advantage of DuckDB's support for graph queries using the duckpgq extension. I've been running queries from .sql files, and everything works like a charm. But the FROM GRAPH_TABLE commands aren't pulling back any results when I run them through ODBC. I'm getting an ExecuteReader:False 0 from my C# code, and I've verified this behavior through the ODBC data source admin trace file. I've also made sure the extension is installed and loaded, and the property graph is created without any apparent errors. I've used the example from the DuckDB documentation to ensure I have a baseline that should work.

The SQL File

Here's the SQL file I'm using, which is a slightly modified version of the example. This is what my C# application is trying to execute:

.echo on
-- .timer on
.conn duckdb

PRAGMA version;

INSTALL duckpgq FROM community;
LOAD duckpgq;

CREATE OR REPLACE TABLE Person AS SELECT * FROM 'https://gist.githubusercontent.com/Dtenwolde/2b02aebbed3c9638a06fda8ee0088a36/raw/8c4dc551f7344b12eaff2d1438c9da08649d00ec/person-sf0.003.csv';
CREATE OR REPLACE TABLE Person_knows_person AS SELECT * FROM 'https://gist.githubusercontent.com/Dtenwolde/81c32c9002d4059c2c3073dbca155275/raw/8b440e810a48dcaa08c07086e493ec0e2ec6b3cb/person_knows_person-sf0.003.csv';

CREATE OR REPLACE PROPERTY GRAPH snb
  VERTEX TABLES (
    Person
  )
  EDGE TABLES (
    Person_knows_person SOURCE KEY (Person1Id) REFERENCES Person (id)
                        DESTINATION KEY (Person2Id) REFERENCES Person (id)
    LABEL knows
  );
  
PRAGMA show_property_graphs;

FROM GRAPH_TABLE (snb
  MATCH (a:Person)
  COLUMNS (a.id)
);

FROM GRAPH_TABLE (snb
  MATCH (a:Person)-[k:knows]->(b:Person)
  COLUMNS (a.id, b.id)
)
LIMIT 1;

FROM GRAPH_TABLE (snb 
  MATCH p = ANY SHORTEST (a:person)-[k:knows]->{1,3}(b:Person) 
  COLUMNS (a.id, b.id, path_length(p))
) 
LIMIT 1;

FROM local_clustering_coefficient(snb, person, knows);

DROP PROPERTY GRAPH snb;

Command Line vs. ODBC

Here's where it gets interesting. When I run this exact same SQL file through the command line using duckdb test.db < .\[...]DuckPGQ_Example.sql, everything works as expected! The graph queries return results, and the property graph is created and queried successfully. The only issue I encountered on the command line was with the local_clustering_coefficient() function, which threw an error related to a missing table csr_cte. The output from the command line execution is included in the original post.

The C# Code and the ODBC Problem

The issue is with my C# ODBC program. It appears that the FROM GRAPH_TABLE syntax is the culprit. While the property graph is created without errors, the subsequent graph table queries are not returning any rows. I've double-checked that the tables Person and Person_knows_person are populated. So, the data should be there, and the graph should be buildable. The command line execution proves this. But the ODBC connection isn't seeing the results.

Code Snippet

The C# code causing the problem:

Console.WriteLine({{content}}quot;    ExecuteReader:{reader.HasRows} {reader.FieldCount}");

The reader.HasRows property is false, indicating that no rows are being returned from the GRAPH_TABLE queries. And the reader.FieldCount is 0, meaning nothing is being selected from the database. I included a complete log from the C# program. The error Catalog Error: Table with name csr_cte does not exist! also appears here and is consistent with the error from the command line execution of local_clustering_coefficient(). But this problem is happening before the execution of this line.

Possible Causes and Troubleshooting Steps

  • ODBC Driver Compatibility: Could there be a compatibility issue between the DuckDB ODBC driver and the GRAPH_TABLE syntax? I'm using version 1.4.1. The driver might not fully support the extension's functionalities. I'll need to check the DuckDB documentation and any known limitations of the ODBC driver.
  • Syntax Differences: Is there a subtle syntax difference between the command-line interface and what the ODBC driver interprets? Although the SQL looks correct, I could try simplifying the queries to see if that helps. Maybe a more basic FROM GRAPH_TABLE query to start with.
  • Transaction Issues: The ODBC connection might not be handling transactions correctly. I can experiment with committing the changes or ensuring that the queries are executed within a transaction scope.
  • Extension Loading within ODBC: Is the extension loading properly when the SQL is executed through ODBC? I already verified that the extension is loaded, but it might be worth checking again or trying to explicitly load it within the ODBC connection itself. I can verify by running PRAGMA show_property_graphs; to see if the graph is properly created before executing the GRAPH_TABLE statements.
  • Data Type Issues: Are there any data type mismatches between the C# application and the data returned by the graph queries? It's unlikely, as the example returns simple INT64 ids, but checking the data types could rule out any potential issues.

Reproduction Steps

Here's how you can reproduce this issue:

  1. Set up DuckDB: Ensure you have DuckDB 1.4.1 installed and the duckpgq extension installed and loaded.
  2. Create Tables: Create the Person and Person_knows_person tables as shown in the SQL file, using data from the provided gist links.
  3. Create Property Graph: Define the snb property graph.
  4. Run Queries: Execute the GRAPH_TABLE queries through an ODBC connection. The C# code snippet shows how I check if data is being returned.

System Information

  • OS: Windows 11
  • DuckDB Version: 1.4.1
  • DuckDB Client: C#, command line
  • Extension Loading: Community extension version

This is a bit frustrating because the command-line interface works flawlessly. If anyone has experience with DuckDB's ODBC driver and the duckpgq extension, your insights would be greatly appreciated. Any tips on debugging or troubleshooting this issue would be awesome! Thanks in advance for any help you can provide!