I second the recommendation in Sir Whinesalot's post (which I haven't fully read yet) to look at miniKanren and microKanren. I found it extremely educational to port microKanren to OCaml a few years ago, and I think the result is somewhat more comprehensible than the original Scheme, though you'll still probably have to read the paper to understand it: http://canonical.org/~kragen/sw/dev3/mukanren.ml
The most astonishing result of miniKanren is a Scheme interpreter that you can run forwards, backwards, or both. http://webyrd.net/quines/quines.pdf demonstrates using it to generate a Scheme quine, that is, a Scheme program whose output when run is itself ("miniKanren, Live and Untagged: Quine Generation via Relational Interpreters (Programming Pearl)", Byrd, Holk, and Friedman).
§4.4 of SICP http://sarabander.github.io/sicp/html/4_002e4.xhtml also has an implementation of logic programming in Scheme which is extremely approachable.
Unlike the post, I don't think Datalog is the place to look for deep insights about logic programming. Instead, it's the place to look for deep insights about databases.
I concur that SICP 4.4 is very approachable. I once took a class that had a simple Prolog assignment; we were given some building plans and had to program a path solver through the building.
I thought it was a bit too easy and I wanted to dig deeper, because just doing the task left you with a taste of "this is magic, just use these spells".
I looked at how to implement Prolog and was stumped until I found that SICP section.
So I ported it to JavaScript, gave it a Prolog-like syntax, and made a web page where you could run the assignment but which also exposed the inner workings. It was one of the neatest things I've ever handed in as coursework.
Insights shminsights, the database connection is where it is at :P
(Thank you for reading the article, I also implemented microKanren before and it's insane how little code it takes to get a full logic programming engine going)
Alongside SICP: https://t3x.org/amk/index.html
Same approach. I think an older version of the book is freely available, or maybe the one on Scheme itself.
Scheme being homoiconic makes it far easier to create quines.
I'll note there is a really shallow version of naive datalog I rather like if you're willing to compromise on syntax and nonlinear variable use.
edge = {(1,2), (2,3)}
path = set()
for i in range(10):
    # path(x,y) :- edge(x,y).
    path |= edge
    # path(x,z) :- edge(x,y), path(y,z).
    path |= {(x, z) for (x, y) in edge for (y1, z) in path if y == y1}
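A variant of the same loop (my own sketch, not the commenter's code) that runs until an actual fixpoint instead of a fixed iteration count:

# Naive Datalog evaluation: apply both rules until no new facts appear.
edge = {(1, 2), (2, 3), (3, 4)}
path = set()
while True:
    new_path = path | edge | {
        (x, z) for (x, y) in edge for (y1, z) in path if y == y1
    }
    if new_path == path:  # fixpoint reached
        break
    path = new_path
# path is now the transitive closure of edge:
# {(1, 2), (2, 3), (3, 4), (1, 3), (2, 4), (1, 4)}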
Similarly it's pretty easy to hand write SQL in a style that looks similar and gain a lot of functionality and performance from stock database engines. https://www.philipzucker.com/tiny-sqlite-datalog/ I wrote a small datalog from the Z3 AST to sqlite recently along these lines: https://github.com/philzook58/knuckledragger/blob/main/kdrag...
Or you can use Python AST + Z3 :) Here is a toy implementation:
https://github.com/true-grue/python-dsls/blob/main/datalog/d...
Love it! I was trying to use python as a dsl frontend to Z3 in a different way https://github.com/philzook58/knuckledragger/blob/ecac7a568a... (it probably has to be done syntactically rather than by overloading in order to make `if`, `and`, etc. work). Still not sure if it is ultimately a good idea. I think it's really neat how you're abusing `,` and `:` in these examples.
In some parallel world, Python is the perfect tool for language-oriented programming. Any decorated function is compiled by its own DSL compiler into the internal representation of the corresponding solver, and functions communicate with each other using rich Python data structures :)
If I ever get around to writing my own at least somewhat serious Datalog engine, I definitely want to add a "translate to SQL" capability. Your work looks like the perfect inspiration, thanks!
(Also added a link to your article on what you can do with Datalog, excellent stuff, couldn't have written it better myself)
Thanks! I wouldn't so much say it was written as that it was vomited out in a year of enthusiasm, but I'm glad it has some value.
Thank you! I have been searching for something like this but for some reason couldn't find your work.
I am currently implementing a Datalog to PostgreSQL query engine at work as we want to experiment with modeling authorization rules in Datalog and then run authorization queries directly in the database. As I want to minimize the round trips to the database I use a different approach than yours and translate Datalog programs to recursive CTEs. These are a bit limited, so we have to restrict ourselves to linearly recursive Datalog programs, but for the purpose of modeling authorization rules that seems to be enough (e.g. you can still model things such as "permissions propagate from groups to group members").
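For readers who haven't seen this trick, here is a minimal sketch of the translation for the classic reachability program - my own illustration, not the engine described above. sqlite3 keeps it self-contained; the same WITH RECURSIVE form works in PostgreSQL:

import sqlite3

# path(X, Z) :- edge(X, Z).
# path(X, Z) :- path(X, Y), edge(Y, Z).   (linearly recursive)
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE edge (x INTEGER, y INTEGER)")
db.executemany("INSERT INTO edge VALUES (?, ?)", [(1, 2), (2, 3), (3, 4)])
rows = db.execute("""
    WITH RECURSIVE path(x, y) AS (
        SELECT x, y FROM edge
        UNION
        SELECT path.x, edge.y FROM path JOIN edge ON path.y = edge.x
    )
    SELECT x, y FROM path ORDER BY x, y
""").fetchall()
print(rows)  # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

Note the UNION (rather than UNION ALL): the deduplication is what makes the iteration terminate on cyclic graphs.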
My suspicion is that if you can get away with it, recursive CTEs would be more performant than doing the datalog iteration query by query. AFAIK for general rules the latter is the only option though. I had a hack in that post to do seminaive using timestamps but it was ugly. I recently came across this new feature in DuckDB and was wondering if there is some way to use it to make a nice datalog: https://duckdb.org/2025/05/23/using-key.html . Anything that adds expressiveness to recursive CTEs is a possibility in that direction.
Yes, I think so too, if only for the reduced number of round-trips to the database. The evaluation strategy that PostgreSQL employs for CTEs is essentially a degenerate form of seminaive evaluation, where it only has to consider the delta for one subgoal in each rule due to being restricted to linearly recursive programs with no mutual recursion. The restriction to no mutual recursion means that PostgreSQL can just stratify the CTE and compute a fixed point using the degenerate seminaive algorithm for each stratum. Once a stratum is done, it can consider its IDB as ground facts in the next, and so on.
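The general seminaive idea, as a hedged Python sketch (my own code, using the other linear form path(x,z) :- path(x,y), edge(y,z)): each round joins only the newly derived delta against edge instead of re-joining all of path:

edge = {(1, 2), (2, 3), (3, 4)}
path = set(edge)   # base rule: path(x,z) :- edge(x,z).
delta = set(edge)  # facts derived in the previous round
while delta:
    derived = {(x, z) for (x, y) in delta for (y2, z) in edge if y == y2}
    delta = derived - path  # keep only genuinely new facts
    path |= delta
# path is the transitive closure; no fact was joined twice.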
I am really unhappy about the design of CTEs, and frankly I think it is a clunky hack designed by someone who was not aware of all the research on Datalog evaluation, which Ullman and others had already written excellent textbooks about at the time. CTEs are simultaneously too powerful - Turing-completeness in a declarative query language is a sure way to tell that the designers screwed up - and too restricted because being stuck in the Turing tarpit rules out so many techniques for optimization.
I didn't know about DuckDB's approach, but it seems to be a way to mark your relations as moded and exploit that during evaluation. It appears to improve performance, but unfortunately the design it builds on is inherently flawed. I wish SQL would get a proper pure Datalog fragment in the future.
What does CTE stand for, and how do I research it?
Common Table Expression, a SQL concept that enables more expressive programming with SQL queries. They are introduced using WITH ...
I did a detailed write-up of implementing miniKanren here:
https://codeberg.org/ashton314/microKanren
By the end of it, I implement a small type checker that, when run backwards (by giving the checker a type), proceeds to enumerate programs that inhabit that type!
Isn't that amazing‽ I wonder if you could guide its search with an LLM...
There is some research work I'm aware of that's trying to make type-safe LLM generation a thing.
Is that research publicly available, and where?
This paper appeared at PLDI '25: https://arxiv.org/abs/2504.09246
Discussed on HN at the time: https://news.ycombinator.com/item?id=43978357
Not sure if this is what you mean, but 'inductive logic programming' and more generally 'program synthesis' are fairly active research areas.
https://github.com/Seeker04/plwm
This window manager implemented in Prolog popped up here recently. It's really cool!
I jumped to it as a new daily driver in the hope that I'd learn some Prolog, and it's been quite the success, actually. The developer is really nice, and he's generously helped me with some basic questions and small PRs.
Definitely recommended. I have a Guix package for it if anyone's interested.
Any reading recommendations for high quality logic programming codebases?
You should publish the Guix package somewhere.
Lately I’ve been dabbling with different Prolog implementations and Constraint Handling Rules, which led me to CLIPS [0] (in the Public Domain, but developed at NASA - sounds neat, doesn’t it?)
It’s not very easy to get into, but it’s very fast on rule resolution and, being pure C, is easy to integrate. I’m trying to get smart code parsing using a logic language and this seems promising. I’m also a Lisp nerd so that works for me :)
[0]: https://www.clipsrules.net/
CLIPS is also used by the Magic the Gathering Arena game engine to implement part of the game logic: https://magic.wizards.com/en/news/mtg-arena/on-whiteboards-n...
https://ryjo.codes/ has done a lot of work with it, and made a course!
I recently implemented a version of Bret Victor's Dynamicland in the browser using miniKanren, Datalog, and WebAssembly: https://deosjr.github.io/dynamicland/
Knowing how to implement a small logic programming language from scratch really feels like a superpower sometimes.
For logic in Python this project looks pretty neat: it encodes facts as typed objects and rules as functions, then lets you run the model using a solver like Soufflé: https://py-typedlogic.github.io/
I haven't found an excuse to really use it though!
Something I’ve wondered about Datalog is whether integers can be added to the language without losing guarantees about termination of query evaluation. It seems like as soon as we add integers with successor() or strings with concat() then we can potentially create infinite relations. Is there a way to add integers or strings (well, really basic scalar operations on integer or string values) while preserving termination guarantees?
This bit at the end of the article seems to imply it’s possible, maybe with some tricks?
> We could also add support for arithmetic and composite atoms (like lists), which introduce some challenges if we wish to stay “Turing-incomplete”.
Not without a termination checker. Take a look at Twelf, a logic programming language and proof assistant based on the dependently typed LF logical framework. You can use general algebraic types in relations, and in general queries can be non-terminating. However, the system has a fairly simple way of checking termination: it uses moded relations and checks that recursive calls have structurally smaller terms in all arguments.
Twelf is quite elegant, although not as powerful as other proof assistants such as Coq. Proofs in Twelf are simply logic programs which have been checked to be total and terminating.
Edit: Here's a link to a short page in the manual which shows how termination checking works: https://twelf.org/wiki/percent-terminates/
The syntax of Twelf is a bit different from other logic languages, but just note that every rule must have a name and that instead of writing `head :- subgoal1, subgoal2, ..., subgoaln` you write `ruleName : head <- subgoal1 <- subgoal2 <- ... <- subgoaln`.
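As a concrete illustration of that syntax (the standard natural-number addition example from the Twelf tutorials, lightly adapted; treat the details as a sketch):

nat : type.
z   : nat.
s   : nat -> nat.

plus   : nat -> nat -> nat -> type.
plus/z : plus z N N.
plus/s : plus (s M) N (s P)
          <- plus M N P.

In Prolog the last rule would be plus(s(M), N, s(P)) :- plus(M, N, P); in Twelf it carries the name plus/s, and a %terminates declaration can then check that the first argument shrinks structurally on every recursive call.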
Also note that this approach only works for top-down evaluation, because it still allows you to define infinite relations (e.g. the successor relation for natural numbers is infinite). Bottom-up evaluation will fail to terminate unless restricted to only derive facts that contribute to some ground query. I don't know if anyone has looked into that problem, but it seems interesting. It is probably related to the "magic sets" transformation for optimizing bottom-up queries, but as far as I understand that does not give any hard guarantees on performance, and I don't know how it would apply to this problem.
By total coincidence, today I've been looking at Datalog implementations. The Datalog(-adjacent) Soufflé tutorial says this right near the start:
> For practical usage, Soufflé extends Datalog to make it Turing-equivalent through arithmetic functors. This results in the ability of the programmer to write programs that may never terminate.
https://souffle-lang.github.io/tutorial
Here’s a quite recent interesting paper about this: https://dl.acm.org/doi/abs/10.1145/3643027
> In this article, we study the convergence of datalog when it is interpreted over an arbitrary semiring. We consider an ordered semiring, define the semantics of a datalog program as a least fixpoint in this semiring, and study the number of steps required to reach that fixpoint, if ever. We identify algebraic properties of the semiring that correspond to certain convergence properties of datalog programs. Finally, we describe a class of ordered semirings on which one can use the semi-naïve evaluation algorithm on any datalog program.
It’s quite neat since this allows them to represent linear regression, gradient descent, and shortest path (APSP) within a framework very similar to regular Datalog.
They have a whole section on the necessary condition for convergence (i.e. termination).
Hey, author here. The one time I go straight to bed after posting on hackernews is the one time I get a bunch of comments hahaha.
Yes, you can add support for integers in various ways where termination is still guaranteed. The simplest trick is to distinguish predicates (like pred(X, 42)) from constraints (like X > 7). Predicates have facts; constraints do not. When checking that every variable in the head of a rule appears in the body, add the condition that it appears in a predicate in the body.
So if you have a predicate like age(X:symbol, Y:int), you can use its facts to limit the set of integers under consideration. Then, if you write:
age(X, A), A + 1 >= 18.
You'd get everyone that becomes an adult next year. Fancier solutions are also possible, for example by employing techniques from finite domain constraint solving.
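A minimal sketch of that safety check (my own code and data layout, not the article's implementation): a rule is accepted only if every head variable, and every variable used in a constraint, is bound by some predicate in the body:

def is_safe(head_vars, body_predicates, body_constraints):
    # Variables bound by predicates, e.g. ("age", ["X", "A"]).
    bound = {a for _, args in body_predicates for a in args if a[0].isupper()}
    # Variables mentioned in constraints, e.g. (">=", ["A", "18"]).
    used = {a for _, args in body_constraints for a in args if a[0].isupper()}
    return set(head_vars) <= bound and used <= bound

# adult_next_year(X) :- age(X, A), A + 1 >= 18.
print(is_safe(["X"], [("age", ["X", "A"])], [(">=", ["A", "18"])]))  # True
# bad(X, Y) :- age(X, A).  -- Y is unbound, so the rule is rejected.
print(is_safe(["X", "Y"], [("age", ["X", "A"])], []))                # False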
> Prolog is honestly kind of jank, and I’m not talking about good vintage jank like C.
Respectfully, I would encourage you to consider that this comment confuses bad Prolog with all Prolog; you are really missing out on some elegant and powerful concepts if you believe Prolog stops here.
I would love this to be an invitation for you to reevaluate modern Prolog and see what the language is really capable of.
I'll throw you right in the deep end with Prolog Definite Clause Grammars[1] and meta-interpreters[2].
There are a few things which are very special and magical to Prolog which are quite mind expanding. I hope you enjoy the resources.
[1]: https://youtu.be/CvLsVfq6cks
[2]: https://youtu.be/nmBkU-l1zyc
DCGs are cute; I like that writing something that looks like a grammar lets you very easily generate sentences for that grammar. That's amazing for things like property-based testing.
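For anyone who hasn't seen one, a two-rule toy DCG (my own example, not from the linked videos) already runs in both directions:

greeting --> [hello], name.
name --> [world].
name --> [prolog].

% ?- phrase(greeting, S).                % generates: S = [hello,world] ; S = [hello,prolog].
% ?- phrase(greeting, [hello, prolog]).  % recognizes: true.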
But why wouldn't you use Mercury instead? No extra-logical nonsense like cuts, IO done properly with linear types, much better performance, etc. DCGs aren't exclusive to Prolog, why suffer the janky bits?
And while DCGs aren't built into it, Soufflé Datalog supports algebraic datatypes (so you can model difference lists), and you could preprocess a DCG into the native clause form plus a length-tracking argument to prevent it from creating an infinitely sized database.
I've never worked with meta-interpreters so I can't comment on those, but I will try to watch the video you posted later.
That's a great question. The answer to "why wouldn't you use Mercury instead?" should hopefully be more clear when you watch the meta-interpreter video. The syntax of Prolog is not accidental, and this is a case where it really matters. You sacrifice a lot of power by introducing things like dot access notation.
Additionally, there is nothing stopping you from writing pure monotonic Prolog, which does not involve cuts and uses pure IO.
Circling back to DCGs, the combination of DCGs plus interpreters, which can be expressed in a handful of lines of code, allows for powerful language crafting.
While constraint solvers are not unique to Prolog, CS+DCG+MI is a killer combination I've only ever seen in Prolog; together they allow you to solve some wicked problems rather elegantly.
That being said, thanks for the reminder about Mercury, that looks like it could come in handy on some Java-only projects.
This is DCG+CS only, still quite powerful and I have adapted it for GOAP and Hierarchical Task Networks quite successfully:
https://www.metalevel.at/zurg/
Finally, I will agree with you that there is a LOT of jank Prolog, but not ALL Prolog is jank, and some Prolog is truly exquisite-- I hope you will keep an eye out for it.
Don't know if you're still reading this thread but just in case I finally got around to watching the meta-interpreter video. I can definitely see why Prolog shines for this.
Though even in that video we see a display of "jank", e.g., needing to ensure that rules don't overlap, being careful about introducing redundant choicepoints, permission errors, the trick with +, etc. in the vanilla meta-interpreter, and the workarounds to get efficient tail recursion in the list-based meta-interpreter.
However, I don't think there's any way to avoid such "jank" when implementing something this powerful. The "jank" is the price to be able to do this. So fair point that this is something Prolog, and only Prolog, can do.
You made my day. I am a long time lisp developer who held a similar view of Prolog for a very long time until for various reasons I stumbled on these videos, and then realized there was a whole aspect to the language I was unaware of that went beyond the logic. Really great community developing around this stuff particularly with Scryer and Trealla that are very active if you are interested in discussing any of this further.
The fact that we were able to have this exchange was great. I appreciate your points and look forward to incorporating your work into mine!
Yes, for some kinds of operations on some kinds of data structures. The keyword/property is "monotonicity". Iterating a monotonic function over a finite domain is guaranteed to reach a fixed point.
Look into Datafun, a total functional language that generalizes Datalog. Also be sure to watch Datafun author Michael Arntzenius's Strange Loop talk.
https://www.rntz.net/datafun/
https://www.youtube.com/watch?v=gC295d3V9gE
It’s all about the terms. As soon as rules can create an infinite sequence of new terms for a single relation, e.g. by addition, you’ve got non-termination.
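A minimal example (my own) of how that happens:

nat(0).
nat(N1) :- nat(N), N1 = N + 1.

Bottom-up evaluation of this pair never reaches a fixpoint: every round derives one brand-new, larger fact for the same relation.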
I think it would be really impactful to start with a problem and describe how logic programming solves that problem better than the other paradigms.
I mention the intuition in passing (you have an object graph with complex bidirectional and derived relationships). Any example that would truly show the benefit would be too big for a blog post. Treat it like the world's smartest database, that's the key.
Another example would be something like an Entity Component System. The moment it starts getting complex (i.e., you have fancy queries and joins), then you're actually implementing a really shitty relational programming engine, and you might as well just implement Datalog instead at that point and reap the benefits.
Other kinds of search problems are probably better tackled by constraint programming instead.
The only production experience I have with logic programming is OPA Rego for writing security policies (not sure it's a "pure" logic language but feels like the primary paradigm).
I found it pretty interesting for that use case, although the learning curve isn't trivial for traditional devs.
https://www.openpolicyagent.org/
I've been reading a bit about it, and it seems easier to make goal-driven backwards chaining AI from it, like the block world example. You could in theory use that for something like a video game AI (like GOAP, Goal-Oriented Action Planning, which is based on STRIPS). Whenever I read about GOAP though, they seem to have used a graphical editor to declaratively input rules rather than a logic programming language.
Note that I'm not an expert in any of this, I've just been reading about this kind of AI recently. I haven't actually done this myself.
Generally speaking, the advantage of logic programming is that it's (more) declarative: you describe the problem and it derives a solution.
Ish? That's only really true if what you are programming can be seen as a search for the completion of a statement.
For an easy example to consider, what would the logic program that describes a common fractal look like? https://rosettacode.org/wiki/Koch_curve#Prolog shows that... it is not necessarily a win for this idea.
For the general task asked in the OP here, I would hope you could find an example in rosettacode that shows prolog gets a good implementation. Unfortunately, I get the impression some folks prefer code golf for these more so than they do "makes the problem obvious."
I’d argue that is not the most ideal Prolog solution. More like it’s simply a recursive implementation of an imperative solution.
For fractals you’ll want to be able to recognize and generate the structures. It’s a great use case for Definite Clause Grammars (DCGs). A perfect example of this would be Triska’s Dragon Curve implementation. https://www.youtube.com/watch?v=DMdiPC1ZckI
I would agree. I was actually hoping to be proven wrong with that example. I have yet to see anything other than a constraint program that looks easier to understand in logic programming, sadly.
Adding late to this, as I didn't get to actually look at this video yesterday.
I would still agree that you can do better than the examples I'm finding, but I am not entirely clear why/how the dragon curve is honestly any better here? The Prolog, notably, does not draw the curve; it's just being used to generate the sequence of characters that describes it. But... that is already trivial in normal code.
Actually drawing it will get you something like this: https://rosettacode.org/wiki/Dragon_curve#Prolog. Contrast that with the Logo version and you can see how the paradigm of the programming language can make a giant difference in how the code solution looks.
This thread was the final push I needed to add logic programming to Mochi https://github.com/mochilang/mochi — a small statically typed scripting language I’m building for agents and real-time data.
I gave OpenAI Codex a single prompt with a sample like:
And it generated a working Datalog engine in Go. Full thinking process: https://chatgpt.com/s/cd_684d3e3c59c08191b20c49ad97b66e01
Total implementation was ~250 LOC. Genuinely amazed how effective the LLM was at helping bootstrap a real logic layer in one go.
The PR is here: https://github.com/mochilang/mochi/pull/616
The only real experience I've had with logic programming is via Brachylog[1], and I enjoyed it very much.
It's a golfing language that I used on codegolf.SE, so my experience is probably very different from that of someone who used logic programming for "real" coding. But I found the experience very pleasurable: the fun in codegolfing is largely from bending your mind in unusual ways to approach the problem (and its sub-problems) differently to solve it with terser code. Brachylog added to this by providing its own set of (what to me were) unusual approaches, thus making sure that the process was always fresh and weird enough to be fun.
This is a great article but it is ruined because the author chose to use Substack. I don't know why people keep following the herd and end up publishing on Substack. If you don't already know: Substack has disallowed ChatGPT from crawling anything, so you cannot use AI assistance to work on the content. They got this content for free but they want to gatekeep it for profit. Additionally, Substack is intentionally horrible at letting you download a nicely formatted PDF or readable version, to lock up the content even further. No one in their right mind should be using Substack.
Folks, really, using GitHub Pages with a static site generator is all you need. Your content will be nice and truly freely accessible to everyone.
I love Prolog but haven't used it in ages. I should really give it a go again. Whenever I used it my brain needed some adjustment time and then switched to declarative mode. It's a really unique experience and hard to describe. It was also my first contact with immutable data.
Is implementing a Kanren and embedding it as suggested by the author really the recommended approach? Back in the day I used Sicstus mostly but tried to use SWI whenever possible (because I'm a FLOSS guy at heart). I'm asking because we went the opposite direction and used Prolog as the first language and called Java or C when needed (io, GUI). I'd describe the result as a "hot mess".
Random note: "Art of Prolog" and "Craft of Prolog" remain two of my favorite CS books to this day.
I'd be curious what the "state of the art" is these days and would love to hear from some folks using Prolog in the trenches.
I can't claim it's the recommended approach, just my own personal recommendation. I apologize if I made it seem like I'm some authority on the subject; I'm just some rando that dislikes SQL.
Funny enough, I made the same mistake you did back in the day. Used Prolog as the "boss" that just called back to Java as needed. My professor gave me a shitty grade because the idea was to do the opposite: a Java program that queries a Prolog database to make decisions; the Prolog part itself wasn't supposed to make any directly.
I was pissed at the time since I was showing off my Prolog skills which in a Logic Programming course I expected would give me a good grade, but that professor was 100% right. The power of logic programming is tainted when you mix it with IO and expect a specific ordering to the rule applications. Cuts are a sin.
Apologies for the wall of text below, but it's required to explain it.
It's not that cuts themselves are a problem, they're a symptom of a larger problem, of which unrestricted IO is another. They both exist because a Prolog implementation is intimately tied to a specific search strategy. One that you, as a programmer, must learn and exploit.
And because you can use cuts and IO, the search strategy must remain what it is. It's an ouroboros. A snake eating its own tail. This enforced search strategy makes Prolog a poor declarative language (despite all the neat parlor tricks you can do with it).
Why? As we climb the abstraction ladder, we relinquish control. This is a double-edged sword. We can express problems orders of magnitude more succinctly, but we must trust the machine to arrive at the solution efficiently and with limited guidance.
For example, C might be considered low level in comparison to most languages, but it is extremely high level compared to assembly.
C compilers can generate efficient machine code precisely because certain details are abstracted away: register allocation, function inlining, loop unrolling, constant-folding, dead-code elimination, etc.
These are only possible thanks to various higher-level concepts being available (variables, constants, functions, loops, conditionals, etc.) and lower-level concepts being unavailable (e.g., unrestricted goto). You have inline assembly, but it's restricted to function bodies, and you have to tell the compiler which registers you touch, or you'd prevent it from applying many optimizations entirely.
Many of the supposedly higher-level languages (like Java, C#, etc.), aren't really higher-level than C. You can express the same things in C, just slightly more verbose. Garbage Collection? You can bolt a garbage collector to C. Generators (yield)? I can implement that with macros. Generics? If void* is too cumbersome, a template file and some include tricks is a few lines of code away. They're just a pile of admittedly nice conveniences.
A pure functional language though? That could allow us to go a level higher. It's not the pattern matching or the higher order functions, however, those are still things C can express. No, the key is the referential transparency and lack of sequencing. Referential transparency means order of execution no longer matters. And if order of execution no longer matters, then we can parallelize everything.
We don't have a good strategy for automating this in a compiler yet, but the capability is there. Large scale MapReduce is only possible because we've abstracted away loops into those higher-level operations (map & reduce) calling referentially transparent functions.
Prolog didn't relinquish enough control. We can't express certain problems without resorting to weird dances to appease the poor search algorithm, same as how I might have to contort the way I write a C program so the compiler can vectorize the code.
If we relinquish enough control to arrive at Datalog (not even Turing-complete), we can make use of all the tools available to the relational model. The sort of dynamic query planning and optimization database engines use. A human can write a better query plan than a database engine, but a human cannot write a tailored query plan every few microseconds.
In Datalog I don't have to worry much about how I write my rules. I can have them be mutually recursive, I can have multiple rules with the same pattern in the head, I can write the body in a really inefficient order. A Datalog engine is free to optimize all of that out. I can't use higher-order predicates, but I can in ASP, which also has massively more efficient search than Prolog.
Even in other Turing-complete logic programming languages, like Mercury and Curry, I have a lot more freedom in how I express the problem because they have the freedom to optimize the code in a way that allows the solution to be arrived at efficiently. Curry can guarantee a solution will be arrived at if one exists, Prolog cannot.
The extra-logical control Prolog provides (like cuts) only serves as a tool to work around the poor search algorithm it uses, one it cannot change because of the control it gives. To me it's just an evolutionary dead-end, albeit one of great historical importance.
The wall of text is fine, but I am not convinced. If you really have a real-life, practical problem with cuts, then your comment fails to articulate it, because I didn't understand what it is meant to be. If it is a matter of aesthetics - higher order languages look prettier and we could, in theory (but not in practice) parallelise them - then I don't care about that. Aesthetics has no place in pragmatic considerations like the amount of resources you really need to be able to use a language. Case in point:
>> We don't have a good strategy for automating this in a compiler yet, but the capability is there
And it's been "there" since the 1960s, but it's still not here. Looks pretty on paper, doesn't work on metal. Close, but no cigar.
>> They both exist because a Prolog implementation is intimately tied to a specific search strategy.
That is not right. There are different execution strategies for Prolog, other than DFS. In fact, Datalog's bottom-up evaluation with a T_P operator is one of those: you can apply it to Prolog programs just fine, except without the nice termination guarantees; but then, Church-Turing thesis.
Another execution strategy is SLG-Resolution with tabulation, a.k.a. tabling, which I mentioned in my other comment. I copy the link to SWI-Prolog's implementation here again:
Tabling is as old as nails in the Prolog community [1]. Besides which, there is a ton of computer science knowledge you can bring to bear on the problem of avoiding non-termination, or unnecessary backtracking. If that is, indeed, the problem. Like I say, I don't understand from your comment what your real-life, real-world, practical problem with using cuts is.
>> Prolog didn't relinquish enough control. We can't express certain problems without resorting to weird dances to appease the poor search algorithm, same as how I might have to contort the way I write a C program so the compiler can vectorize the code.
I don't understand what that means. Prolog is FOL running on a digital computer. One is a computational paradigm, the other is a computational device. There is a mismatch between the two, unfortunately and inevitably, because the difference between theory and practice etc etc, and there are concessions that had to be made to be able to run one on top of the other. Messrs Kowalski and Colmerauer (the man whose name not even French Computer Scientists can pronounce - true story) alighted on a set of choices that had the advantage of being the most practical possible, for the purpose of having a real language, that one can write real code in, to run on real computers; unlike, say, Haskell.
And you can do a lot more than just "write code". Prolog, because its interpreter is an implementation of SLD-Resolution, is not just a programming language but a sound and (refutation-) complete deductive inference system for FOL. And with recent advances in Inductive Logic Programming, it turns out it's also a sound and refutation complete inductive inference system, too.
Another couple of points:
DFS is not a poor choice of a search algorithm. It's an efficient one. You say ASP is "massively more efficient" than Prolog. It's not. ASP interpreters must first solve an NP-complete problem, which they only do thanks to powerful, special-purpose solvers. Those solvers are fast [2], but efficiency is a whole other ball game (we don't have efficient algorithms for NP-complete problems). The NP-complete problem is that, because of ASP's bottom-up interpretation, the entire Herbrand base of a predicate must be fully ground. That's why ASP also has no functions [3]; so, no lists. The big advantage of Resolution, its One Neat Trick as an algorithm, is exactly that it doesn't have to ground the entire Herbrand base before finding a proof, if one exists. That's thanks to unification. More to the point, as Resolution proceeds, it prunes the search space of proofs, just by discarding branches starting at goals that fail unification. ASP can't do that [4,5].
See, every time anyone has tried to "fix" Prolog's not-fully-100%-declarativeness, they've just broken it.
Or, as I like to say about logic programming in general; sound, complete, efficient: choose two.
[3] Because functions turn a predicate's Herbrand base infinite, and then it takes an infinite amount of time to ground it.
[4] Well, ish. Standard bottom-up ASP can't, but there are different ways to evaluate ASP ... programs, too. For instance, you can evaluate them top-down, with unification, just like in Prolog. See sCASP (first link I could find; dig deeper):
[5] It's not just unification pruning the search space of proofs, either. It's that the cardinality of the resolution closure of a set of Horn clauses decreases monotonically until either the empty clause is constructed by resolution, or... well, infinity.
I was thinking of an idea very similar to this some weeks ago, where you define the "rules" that structure your program, and I think Prolog is the one!
I'll look into getting a taste this weekend.
Some time ago I worked on CockroachDB, where I was implementing planning for complex online schema changes.
We really wanted a model that could convincingly handle and reasonably schedule arbitrary combinations of schema change statements that are valid in Postgres. Unlike MySQL, Postgres offers transactional schema changes. Unlike Postgres, Cockroach strives to implement online schema changes in a protocol inspired by F1 [0]. Also, you want to make sure you can safely roll back (until you’ve reached the point where you know it can’t fail, after which only metadata updates are allowed).
The model we came up with was to decompose all the things that can possibly change into “elements” [1], where each element has a schedule of state transitions that moves the element through a sequence of states from public to absent or vice versa [2]. Each state transition has operations [3].
Anyway, you end up wanting to define rules that say that certain element states have to be entered before others if the elements are related in some way. Or perhaps some transitions should happen at the same time. To express these rules I created a little datalog-like framework I called rel [4]. It lets you embed a rules engine in Go, add indexes to it for a sufficiently efficient implementation, and know that all your lookups are indexed statically. You write the rules in Go [5]. To be honest it could be more ergonomic.
The rules are written in Go but for testing and visibility they produce a datomic-inspired format [6]. There’s a lot of rules now!
The internal implementation isn’t too far off from the search implementation presented here [7]. Here’s unify [8]. The thing has some indexes and index selection for acceleration. It also has inverted indexes for set containment queries.
It was fun to make a little embedded logic language and to have had a reason to!
I tried writing a tiny logic engine once after reading the SICP chapter, and yeah, the syntax part was easy to mimic but the actual backtracking logic hit me harder than expected. What helped was thinking of it less like solving and more like building a lazy search tree. Once that clicked, I stopped trying to force control flow and just let the tree expand. One thing I didn't see many tutorials mention: handling state or side effects blows up fast. Even printing during backtracking messes the whole thing up. Most tutorials avoid that part completely.
Yeah, agree. Keeping side effects in the host language makes life easier. Tried forcing it in Prolog once and instantly regretted it. miniKanren with clean boundaries felt way more maintainable.
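To make the "lazy search tree" idea concrete, here is a minimal microKanren-flavored sketch (all names and details are my own, not from any linked implementation): goals are functions from a substitution to a lazy stream of substitutions, so disjunction is branching and conjunction is nested expansion.

import itertools

class Var:
    """A logic variable, identified by object identity."""

def walk(t, subst):
    # Follow variable bindings until a non-variable or a free variable.
    while isinstance(t, Var) and t in subst:
        t = subst[t]
    return t

def unify(u, v, subst):
    u, v = walk(u, subst), walk(v, subst)
    if u is v or u == v:
        return subst
    if isinstance(u, Var):
        return {**subst, u: v}
    if isinstance(v, Var):
        return {**subst, v: u}
    return None  # clash: no unifier

def eq(u, v):          # leaf goal: succeeds at most once
    def goal(subst):
        s = unify(u, v, subst)
        if s is not None:
            yield s
    return goal

def disj(g1, g2):      # a branch point in the search tree
    return lambda s: itertools.chain(g1(s), g2(s))

def conj(g1, g2):      # expand g2 beneath every solution of g1
    return lambda s: (s2 for s1 in g1(s) for s2 in g2(s1))

x = Var()
print([s[x] for s in disj(eq(x, 1), eq(x, 2))({})])  # [1, 2]

(itertools.chain is enough for finite trees; a real miniKanren interleaves the streams so an infinite branch can't starve the others.)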
>> The main issue is that Prolog is not truly declarative; how you write the rules has a major impact on how the interpreter behaves. You might end up with repeated answers for a query or it might enter an infinite loop. Prolog also allows IO, so the order in which things get executed is critical. It’s Turing-complete, for better or worse.
Yeah, sounds like a real show-stopper. Prolog is not 100% declarative, which is really what every programmer wants from a programming language.
Better use a language like C# or Java, or Python, which are 0% declarative. Or try to implement datalog (only not really) so that you can't even use lists, because that's what real-world programmers are looking for in a real-world programming language: 100% declarative, 0% real-world uses.
Ah well, at least the author has heard of SLD-Resolution which is more than one can say about the authors of SICP. But they haven't heard of SLG Resolution with tabulation (a.k.a. "tabling") which is a way to execute Prolog programs while preserving the Turing-completeness but without getting stuck in left-recursions. Among other things.
>> Prolog also allows IO, so the order in which things get executed is critical.
I paid €7 to make this post because I'm on a boat in the middle of the Adriatic sea and my Vodafone contract is AWOL, so listen closely: no, Prolog's IO is not what makes it Turing Complete. That comes from definite logic, i.e. a fragment of the First Order Predicate Calculus (a.k.a. First Order Logic) restricted to formulae in clausal form with at most one positive literal a.k.a Horn clauses, that nevertheless retains the full expressivity of full FOL. Definite logic is semi-decidable and when evaluated by Resolution, as SLD-Resolution (restricted to definite program clauses and Horn goals), it is refutation-complete (and sound, to boot). Meaning, in summary, that any true sentence can be proved true but if a sentence is false then it cannot always be proved false, by SLD-Resolution. So far, so theoretically good.
What Prolog does which is really dangerous and makes the order of execution "critical" is that it allows the programmer to edit the program database, i.e. the database of facts and rules, in other words the program itself, programmatically, i.e. from the program itself. This means that not all your program's state is known before the program runs and database operations are completed. That sucks hairy balls but it is also a very useful mechanism that requires experience and understanding of the language, its core concepts, and programming in general, to use right.
But Prolog's Turing completeness has nothing to do with IO, or database ops. That was not worth €7.
Not really logic programming, but a while ago I made this, also in Python:
https://github.com/ariroffe/logics (mainly for educational purposes).
Parts of the implementation kind of reminded me of it.
I second the recommendation in Sir Whinesalot's post (which I haven't fully read yet) to look at miniKanren and microKanren. I found it extremely educational to port microKanren to OCaml a few years ago, and I think the result is somewhat more comprehensible than the original Scheme, though you'll still probably have to read the paper to understand it: http://canonical.org/~kragen/sw/dev3/mukanren.ml
The most astonishing result of miniKanren is a Scheme interpreter that you can run forwards, backwards, or both. http://webyrd.net/quines/quines.pdf demonstrates using it to generate a Scheme quine, that is, a Scheme program whose output when run is itself ("miniKanren, Live and Untagged: Quine Generation via Relational Interpreters (Programming Pearl)", Byrd, Holk, and Friedman).
§4.4 of SICP http://sarabander.github.io/sicp/html/4_002e4.xhtml also has an implementation of logic programming in Scheme which is extremely approachable.
Unlike the post, I don't think Datalog is the place to look for deep insights about logic programming. Instead, it's the place to look for deep insights about databases.
I concur that SICP 4.4 is very approachable. I once took a class that had a simple Prolog assignment, I recall we were given some building plans and had to program a path solver through the building. I thought it was a bit too easy and I wanted to dig deeper, because just doing the task left you with a taste of "this is magic, just use these spells".
I looked at how to implement Prolog and was stumped until I found that SICP section.
So I ported it to JavaScript and gave it a Prolog-like syntax and made a web page where you could run the assignment but also exposed the inner workings, and it was one of the neatest things I've ever handed in as coursework.
Insights shminsights, the database connection is where it is at :P
(Thank you for reading the article, I also implemented microKanren before and it's insane how little code it takes to get full a logic programming engine going)
Among SICP, https://t3x.org/amk/index.html
Same approach. I think an older version of the book it's freely available, or maybe the one on Scheme itself.
Scheme being homoiconic makes far easier to create quines.
Nice!
I'll note there is a really shallow version of naive datalog I rather like if you're willing to compromise on syntax and nonlinear variable use.
Similarly it's pretty easy to hand write SQL in a style that looks similar and gain a lot of functionality and performance from stock database engines. https://www.philipzucker.com/tiny-sqlite-datalog/I wrote a small datalog from the Z3 AST to sqlite recently along these lines https://github.com/philzook58/knuckledragger/blob/main/kdrag...
Or you can use Python AST + Z3 :) Here is a toy implementation:
https://github.com/true-grue/python-dsls/blob/main/datalog/d...
Love it! I was trying to use python as a dsl frontend to Z3 in a different way https://github.com/philzook58/knuckledragger/blob/ecac7a568a... (it probably has to be done syntactically rather than by overloading in order to make `if` `and` etc work). Still not sure if it is ultimately a good idea. I think it's really neat how you're abusing `,` and `:` in these examples
In some parallel world, Python is the perfect tool for language-oriented programming. Any decorated function is compiled by its own DSL compiler into the internal representation of the corresponding solver, and functions communicate with each other using rich Python data structures :)
If I ever get around to writing my own at least somewhat serious Datalog engine, I definitely want to add a "translate to SQL" capability. Your work looks like the perfect inspiration, thanks!
(Also added a link to your article on what you can do with Datalog, excellent stuff, couldn't have written it better myself)
Thanks! I wouldn't so much say it was written so much as it was vomited out in a year of enthusiasm, but I'm glad it has some value.
Thank you! I have been searching for something like this but for some reason couldn't find your work.
I am currently implementing a Datalog to PostgreSQL query engine at work as we want to experiment with modeling authorization rules in Datalog and then run authorization queries directly in the database. As I want to minimize the round trips to the database I use a different approach than yours and translate Datalog programs to recursive CTEs. These are a bit limited, so we have to restrict ourselves to linearly recursive Datalog programs, but for the purpose of modeling authorization rules that seems to be enough (e.g. you can still model things such as "permissions propagate from groups to group members").
My suspicion is that if you can get away with it that recursive CTEs would be more performant than doing the datalog iteration query by query. AFAIK for general rules the latter is the only option though. I had a scam in that post to do seminaive using timestamps but it was ugly. I recently came across this new feature in duckdb and was wondering if there is some way to use it to make a nice datalog https://duckdb.org/2025/05/23/using-key.html . Anything that adds expressiveness to recursive CTEs is a possibility in that direction
Yes, I think so too, if not just for the reduced number of round-trips to the database. The evaluation strategy that PostgreSQL employs for CTEs is essentially a degenerate form of seminaive evaluation where it only has to consider the delta for one subgoal in each rule due to being restricted to linearly recursive programs with no mutual recursion. The restriction to not have mutual recursion means that PostgreSQL can just stratify the CTE and compute a fixed point using the degenerate seminaive algorithm for each strata. Once a stratum is done, it can consider its idb as ground facts in the next, and so on.
I am really unhappy about the design of CTEs, and frankly I think it is a clunky hack designed by someone who was not aware of all the research on Datalog evaluation, which Ullman and others had already written excellent textbooks about at the time. CTEs are simultaneously too powerful - Turing-completeness in a declarative query language is a sure way to tell that the designers screwed up - and too restricted because being stuck in the Turing tarpit rules out so many techniques for optimization.
I didn't know about DuckDB's approach, but it seems to be a way to mark your relations as moded and exploit that during the evaluation. It appears to improve performance, but unfortunately the design it builds on is inherently flawed. I wish SQL would get a proper pure Datalog fragment in the future.
What does CTE stand for, and how do I research it ?
Common Table Expression, a SQL concept that enables more expressive programming with SQL queries. They are introduced using WITH ...
This requires some discrete math knowledge?
That's exciting!
I did a detailed write-up of implementing miniKanren here:
https://codeberg.org/ashton314/microKanren
By the end of it, I implement a small type checker that, when you run it backwards (by giving the checker a type), it proceeds to enumerate programs that inhabit that type!
Isn't that amazing‽ I wonder if you could guide its search with an LLM...
There is some research work I’m aware of that’s trying to make type-safe LLM generation a thing.
Is that research publically available, and where ?
This paper appeared at PLDI '25: https://arxiv.org/abs/2504.09246
Discussed on HN at the time: https://news.ycombinator.com/item?id=43978357
Not sure if this is what you mean, but 'inductive logic programming' and more generally 'program synthesis' are fairly active research areas.
https://github.com/Seeker04/plwm
This window manager implemented in Prolog popped up here recently. It's really cool!
I jumped to it as a new daily driver in the hope that I'd learn some Prolog, and it's been quite the success, actually. The developer is really nice, and he's generously helped me with some basic questions and small PRs.
Definitely recommended. I have a Guix package for it if anyone's interested.
Any reading recommendations for high quality logic programming codebases?
You should publish the Guix package somewhere.
Lately I’ve been dabbling with different Prolog implementations and Constraint Handling Rules which led me to CLIPS [0] (in Public Domain, but developed at NASA - sounds neat doesn’t it?)
It’s not very easy to get into, but it’s very fast on rule resolution and being pure C is easy to integrate. I’m trying to get smart code parsing using logic language and this seems promising. I’m also a Lisp nerd so that works for me :)
[0]: https://www.clipsrules.net/
CLIPS is also used by the Magic the Gathering Arena game engine to implement part of the game logic: https://magic.wizards.com/en/news/mtg-arena/on-whiteboards-n...
https://ryjo.codes/ has done a lot of work with it, and made a course!
Whoa, awesome! Thanks for the link.
I recently implemented a version of Bret Victor's Dynamicland in the browser using miniKanren, Datalog, and WebAssembly: https://deosjr.github.io/dynamicland/
Knowing how to implement a small logic programming language from scratch really feels like a superpower sometimes.
For logic in Python this project looks pretty neat, it encodes facts as typed objects and rules as functions, then allows you to run the model using a solver like soufflé: https://py-typedlogic.github.io/
I haven't found an excuse to really use it though!
Something I’ve wondered about Datalog is whether integers can be added to the language without losing guarantees about termination of query evaluation. It seems like as soon as we add integers with successor() or strings with concat() then we can potentially create infinite relations. Is there a way to add integers or strings (well, really basic scalar operations on integer or string values) while preserving termination guarantees?
This bit at the end of the article seems to imply it’s possible, maybe with some tricks?
> We could also add support for arithmetic and composite atoms (like lists), which introduce some challenges if we wish to stay “Turing-incomplete”.
Not without a termination checker. Take a look at Twelf, it is a logic programming language and proof assistant based on the dependently typed LF logical framework. You can use general algebraic types in relations, and in general queries can be non-terminating. However, the system has a fairly simple way of checking termination using moded relations and checking that recursive calls have structurally smaller terms in all arguments.
Twelf is quite elegant, although not as powerful as other proof assistants such as Coq. Proofs in Twelf are simply logic programs which have been checked to be total and terminating.
Edit: Here's a link to a short page in the manual which shows how termination checking works: https://twelf.org/wiki/percent-terminates/
The syntax of Twelf is a bit different from other logic languages, but just note that every rule must have a name and that instead of writing `head :- subgoal1, subgoal2, ..., subgoaln` you write `ruleName : head <- subgoal1 <- subgoal2 <- ... <- subgoaln`.
Also note that this approach only works for top-down evaluation because it still allows you to define infinite relations (e.g. the successor relation for natural numbers is infinite). Bottom-up evaluation will fail to terminate unless restricted to only derive facts that contribute to some ground query. I don't know if anyone have looked into that problem, but that seems interesting. It is probably related to the "magic sets" transformation for optimizing bottom-up queries, but as far as I understand that does not give any hard guarantees to performance, and I don't know how it would apply to this problem.
By total coincidence, today I've been looking at Datalog implementations. The Datalog(-adjacent) Soufflé tutorial says this right near the start:
> For practical usage, Soufflé extends Datalog to make it Turing-equivalent through arithmetic functors. This results in the ability of the programmer to write programs that may never terminate.
https://souffle-lang.github.io/tutorial
Here’s a quite recent interesting paper about this: https://dl.acm.org/doi/abs/10.1145/3643027
> In this article, we study the convergence of datalog when it is interpreted over an arbitrary semiring. We consider an ordered semiring, define the semantics of a datalog program as a least fixpoint in this semiring, and study the number of steps required to reach that fixpoint, if ever. We identify algebraic properties of the semiring that correspond to certain convergence properties of datalog programs. Finally, we describe a class of ordered semirings on which one can use the semi-naïve evaluation algorithm on any datalog program.
It’s quite neat since this allows them to represent linear regression, gradient decent, shortest path (APSP) within a very similar framework as regular Datalog.
They have a whole section on the necessary condition for convergence (i.e. termination).
Hey, author here. The one time I go straight to bed after posting on hackernews is the one time I get a bunch of comments hahaha.
Yes you can add support for integers in various ways where termination is still guaranteed. The simplest trick is to distinguish predicates (like pred(X, 42)) from constraints (like X > 7). Predicates have facts, constraints do not. When checking that every variable in the head of a rule appears in the body, add the condition that it appears in a predicate in the body.
So if you have a predicate like age(X:symbol, Y:int), you can use its facts to limit the set of integers under consideration. Then, if you write:
age(X, A), A + 1 >= 18.
You'd get everyone that becomes an adult next year. Fancier solutions are also possible, for example by employing techniques from finite domain constraint solving.
> Prolog is honestly kind of jank, and I’m not talking about good vintage jank like C.
Respectfully I would encourage you to consider this comment is confusing bad Prolog with all Prolog, and you are really missing out on some elegant and powerful concepts if you believe Prolog stops here.
I would love this to be an invitation for you to reevaluate modern Prolog and see what the language is really capable of.
I'll throw you right in the deep end with Prolog Definite Clause Grammars[1] and meta-interpreters[2].
There are a few things which are very special and magical to Prolog which are quite mind expanding. I hope you enjoy the resources.
[1]: https://youtu.be/CvLsVfq6cks
[2]: https://youtu.be/nmBkU-l1zyc
DCGs are cute, I like that writing something that looks like a grammar allows you to very easily generate sentences for that grammar. That's amazing for things like property-based testing.
But why wouldn't you use Mercury instead? No extra-logical nonsense like cuts, IO done properly with linear types, much better performance, etc. DCGs aren't exclusive to Prolog, why suffer the janky bits?
And while they're not built-in to it, Soufflé Datalog supports algebraic datatypes (so you can model difference lists), and you could preprocess a DCG into the native clause form plus a length tracking argument to prevent it from creating an infinitely sized database.
I've never worked with meta-interpreters so I can't comment on those, but I will try to watch the video you posted later.
That's a great question. The answer to "why wouldn't you use Mercury instead?" should hopefully be more clear when you watch the meta-interpeter video. The syntax of Prolog is not accidental, and this is a case where it really matters. You sacrifice a lot of power by introducing things like dot access notation.
Additionally, there is nothing stopping you from writing pure monotonic Prolog, which does not involve cuts and uses pure IO.
Circling back to DCGs, the combination of DCGs plus interpreters, which can be expressed in a handful of lines of code, allows for powerful language crafting.
While constraint solvers are not unique to Prolog, CS+DCG+MI is a killer combination I've only ever seen in Prolog that when combined allow you to solve some wicked problems rather elegantly.
That being said, thanks for the reminder about Mercury, that looks like it could come in handy on some Java-only projects.
This is DCG+CS only, still quite powerful and I have adapted it for GOAP and Hierarchical Task Networks quite successfully: https://www.metalevel.at/zurg/
Finally, I will agree with you that there is a LOT of jank Prolog, but not ALL Prolog is jank, and some Prolog is truly exquisite-- I hope you will keep an eye out for it.
Don't know if you're still reading this thread but just in case I finally got around to watching the meta-interpreter video. I can definitely see why Prolog shines for this.
Though even in that video we see a display of "jank", e.g., needing to ensure that rules don't overlap, being careful with introducing redundant choicepoints, permission errors, the trick with +, etc. in the vanilla meta-interpreter and the workarounds to get efficient tail recursion in the list-based meta-interpreter.
However, I don't think there's any way to avoid such "jank" when implementing something this powerful. The "jank" is the price to be able to do this. So fair point that this is something Prolog, and only Prolog, can do.
You made my day. I am a long time lisp developer who held a similar view of Prolog for a very long time until for various reasons I stumbled on these videos, and then realized there was a whole aspect to the language I was unaware of that went beyond the logic. Really great community developing around this stuff particularly with Scryer and Trealla that are very active if you are interested in discussing any of this further.
The fact that we were able to have this exchange was great. I appreciate your points and look forward to incorporating your work into mine!
Thanks, this is really helpful! And great article.
Yes, for some kinds of operations on some kinds of data structures. The keyword/property is "monotonicity". Monotonic functions are guaranteed to terminate under fixed-point semantics.
Look into Datafun: A total functional language that generalizes Datalog. Also be sure to watch Datafun author Michael Arntzenius's Strangeloop talk.
https://www.rntz.net/datafun/
https://www.youtube.com/watch?v=gC295d3V9gE
It’s all about the terms. As soon as rules can create an infinite sequence of new terms for a single relation, e.g. by addition, you’ve got non-termination.
I think it would be really impactful to start with a problem and describe how logic programming solves that problem better than the other paradigms.
I mention the intuition in passing (you have an object graph with complex bidirectional and derived relationships). Any example that would truly show the benefit would be too big to show on a blog post. Treat it like the world's smartest database, that's the key.
Another example would be something like an Entity Component System. The moment it starts getting complex (i.e., you have fancy queries and joins), then you're actually implementing a really shitty relational programming engine, and you might as well just implement Datalog instead at that point and reap the benefits.
Other kinds of search problems are probably better tackled by constraint programming instead.
The only production experience I have with logic programming is OPA Rego for writing security policies (not sure it's a "pure" logic language but feels like the primary paradigm).
I found it pretty interesting for that use case, although the learning curve isn't trivial for traditional devs.
https://www.openpolicyagent.org/
I've been reading a bit about it, and it seems easier to build goal-driven, backward-chaining AI with it, like the blocks-world example. You could in theory use that for something like video game AI (like GOAP, Goal-Oriented Action Planning, which is based on STRIPS). Whenever I read about GOAP, though, the authors seem to have used a graphical editor to enter rules declaratively rather than a logic programming language.
Note that I'm not an expert in any of this, I've just been reading about this kind of AI recently. I haven't actually done this myself.
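From my reading, though, the core of goal-driven backward chaining is small enough to sketch; here's a toy in Python with made-up facts and rules (definitely not how a shipping GOAP system works):

# A goal holds if it's a known fact, or some rule concludes it
# and all of that rule's preconditions hold (chaining backwards).
facts = {"has_axe", "near_tree"}
rules = [("has_wood", ["has_axe", "near_tree"]),   # head, preconditions
         ("has_fire", ["has_wood", "has_matches"])]

def prove(goal):
    if goal in facts:
        return True
    return any(head == goal and all(prove(p) for p in body)
               for head, body in rules)

print(prove("has_wood"))  # True
print(prove("has_fire"))  # False: no matches anywhere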
Generally speaking, the advantage of logic programming is that it's (more) declarative: you describe the problem and it derives a solution.
Ish? Isn't that only really true if what you are programming can be seen as a search for the completion of a statement?
For an easy example to consider, what would the logical program look like that described any common fractal? https://rosettacode.org/wiki/Koch_curve#Prolog shows that... it is not necessarily a win for this idea.
For the general task asked about in the OP here, I would hope you could find an example on Rosetta Code that shows Prolog getting a good implementation. Unfortunately, I get the impression some folks prefer code golf for these more than they do "makes the problem obvious."
I'd argue that is not an ideal Prolog solution. It's more a recursive transcription of an imperative solution.
For fractals you’ll want to be able to recognize and generate the structures. It’s a great use case for Definite Clause Grammars (DCGs). A perfect example of this would be Triska’s Dragon Curve implementation. https://www.youtube.com/watch?v=DMdiPC1ZckI
I would agree. I was actually hoping to be proven wrong with that example. I have yet to see anything other than a constraint program that looks easier to understand in logic programming, sadly.
Adding late to this, as I didn't get to actually look at this video yesterday.
I would still agree that you can do better than the examples I'm finding, but I am not entirely clear why/how the dragon curve is honestly any better here? The Prolog, notably, does not draw the curve; it is just used to generate the sequence of characters that describes it. But that is already trivial in normal code.
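For comparison, the character-generation part in "normal code" really is tiny; a Python sketch using the standard dragon-curve L-system rules (X -> X+YF+, Y -> -FX-Y):

# Repeatedly rewrite the axiom "FX" to get the turtle-command string.
def dragon(iterations):
    s = "FX"
    for _ in range(iterations):
        s = "".join({"X": "X+YF+", "Y": "-FX-Y"}.get(c, c) for c in s)
    return s

print(dragon(3))  # FX+YF++-FX-YF++-FX+YF+--FX-YF+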
Actually drawing it will get you something like this: https://rosettacode.org/wiki/Dragon_curve#Prolog. Contrast that with the Logo version and you can see how the paradigm of the programming language can make a giant difference in how the code solution looks.
This thread was the final push I needed to add logic programming to Mochi https://github.com/mochilang/mochi — a small statically typed scripting language I’m building for agents and real-time data.
I gave OpenAI Codex a single prompt with a sample like:
And it generated a working Datalog engine in Go.
Full thinking process: https://chatgpt.com/s/cd_684d3e3c59c08191b20c49ad97b66e01
Total implementation was ~250 LOC. Genuinely amazed at how effective the LLM was at helping bootstrap a real logic layer in one go.
The PR is here https://github.com/mochilang/mochi/pull/616
The only real experience I've had with logic programming is via Brachylog[1], and I enjoyed it very much.
It's a golfing language that I used on codegolf.SE, so my experience is probably very different from that of someone who used logic programming for "real" coding. But I found the experience very pleasurable: the fun in codegolfing is largely from bending your mind in unusual ways to approach the problem (and its sub-problems) differently to solve it with terser code. Brachylog added to this by providing its own set of (what to me were) unusual approaches, thus making sure that the process was always fresh and weird enough to be fun.
[1] https://github.com/JCumin/Brachylog/wiki
This is a great article, but it is ruined because the author chose to use Substack. I don't know why people keep following the herd and end up publishing on Substack. If you don't already know, Substack has disallowed ChatGPT from crawling anything, so you cannot use AI assistance to work with the content. They got this content for free, but they want to gatekeep it to profit from it. Additionally, Substack is intentionally horrible about letting you download nicely formatted PDF or readable versions, to lock up the content even further. No one in their right mind should be using Substack.
Folks, really, GitHub Pages with a static website generator is all you need. Your content will be nice and truly freely accessible to everyone.
> Substack has disallowed ChatGPT to crawl anything so you cannot take AI assistance to work on the content
Is that really a criticism?
microKanren et al. are nice! But it is becoming sort of a monoculture where other approaches get ignored.
Before microKanren, the rite of passage for logic programming was to build a Prolog using Warren's Abstract Machine (WAM).
https://direct.mit.edu/books/monograph/4253/Warren-s-Abstrac...
Well, the blog post has a Datalog implementation, so there is that.
I love Prolog but haven't used it in ages. I should really give it a go again. Whenever I used it my brain needed some adjustment time and then switched to declarative mode. It's a really unique experience and hard to describe. It was also my first contact with immutable data.
Is implementing a Kanren and embedding it, as the author suggests, really the recommended approach? Back in the day I used SICStus mostly but tried to use SWI whenever possible (because I'm a FLOSS guy at heart). I'm asking because we went the opposite direction and used Prolog as the first language, calling Java or C when needed (IO, GUI). I'd describe the result as a "hot mess".
Random note: "Art of Prolog" and "Craft of Prolog" remain two of my favorite CS books to this day.
I'd be curious what the "state of the art" is these days and would love to hear from some folks using Prolog in the trenches.
I can't claim it's the recommended approach, just my own personal recommendation. I apologize if I made it seem like I'm some authority on the subject, I'm just some rando that dislikes SQL.
Funny enough, I made the same mistake you did back in the day. Used Prolog as the "boss" that just called back to Java as needed. My professor gave me a shitty grade because the idea was to do the opposite: a Java program that queries a Prolog database to make decisions; the Prolog part itself wasn't supposed to make any directly.
I was pissed at the time since I was showing off my Prolog skills which in a Logic Programming course I expected would give me a good grade, but that professor was 100% right. The power of logic programming is tainted when you mix it with IO and expect a specific ordering to the rule applications. Cuts are a sin.
Cuts are perfectly fine. What's the problem cuts have caused you?
Apologies for the wall of text below, but it's required to explain it.
It's not that cuts themselves are a problem, they're a symptom of a larger problem, of which unrestricted IO is another. They both exist because a Prolog implementation is intimately tied to a specific search strategy. One that you, as a programmer, must learn and exploit.
And because you can use cuts and IO, the search strategy must remain what it is. It's an ouroboros. A snake eating its own tail. This enforced search strategy makes Prolog a poor declarative language (despite all the neat parlor tricks you can do with it).
Why? As we climb the abstraction ladder, we relinquish control. This is a double-edged sword. We can express problems orders of magnitude more succinctly, but we must trust the machine to arrive at the solution efficiently and with limited guidance.
For example, C might be considered low level in comparison to most languages, but it is extremely high level compared to assembly.
C compilers can generate efficient machine code precisely because certain details are abstracted away: register allocation, function inlining, loop unrolling, constant-folding, dead-code elimination, etc.
These are only possible thanks to various higher-level concepts being available (variables, constants, functions, loops, conditionals, etc.) and lower-level concepts being unavailable (e.g., unrestricted goto). You have inline assembly, but it's restricted to function bodies, and you have to tell the compiler which registers you touch, or you'd prevent it from applying many optimizations entirely.
Many of the supposedly higher-level languages (like Java, C#, etc.), aren't really higher-level than C. You can express the same things in C, just slightly more verbose. Garbage Collection? You can bolt a garbage collector to C. Generators (yield)? I can implement that with macros. Generics? If void* is too cumbersome, a template file and some include tricks is a few lines of code away. They're just a pile of admittedly nice conveniences.
A pure functional language though? That could allow us to go a level higher. It's not the pattern matching or the higher-order functions, however; those are still things C can express. No, the key is the referential transparency and lack of sequencing. Referential transparency means order of execution no longer matters. And if order of execution no longer matters, then we can parallelize everything.
We don't have a good strategy for automating this in a compiler yet, but the capability is there. Large-scale MapReduce is only possible because we've abstracted loops away into those higher-level operations (map & reduce) calling referentially transparent functions.
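As a small illustration of the point, with a pure function a sequential map and a parallel map are interchangeable; a Python sketch (hypothetical workload, standard library only):

from multiprocessing import Pool

def square(n):  # pure: the result depends only on the argument
    return n * n

if __name__ == "__main__":
    data = list(range(1000))
    sequential = list(map(square, data))
    with Pool() as pool:  # safe to parallelize: no shared state,
        parallel = pool.map(square, data)  # no ordering dependency
    assert sequential == parallel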
Prolog didn't relinquish enough control. We can't express certain problems without resorting to weird dances to appease the poor search algorithm, same as how I might have to contort the way I write a C program so the compiler can vectorize the code.
If we relinquish enough control to arrive at Datalog (not even Turing-complete), we can make use of all the tools available to the relational model. The sort of dynamic query planning and optimization database engines use. A human can write a better query plan than a database engine, but a human cannot write a tailored query plan every few microseconds.
In Datalog I don't have to worry much about how I write my rules. I can have them be mutually recursive, I can have multiple rules with the same pattern in the head, I can write the body in a really inefficient order. A Datalog engine is free to optimize all of that out. I can't use higher-order predicates, but I can in ASP, which also has massively more efficient search than Prolog.
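In the naive-evaluation style shown earlier in the thread, that freedom is visible directly; a small Python sketch (made-up relations) where rule order, body order, and even left recursion are all harmless:

# Iterate ALL rules to a fixed point: any rule ordering gives the
# same final relation, unlike Prolog's order-sensitive search.
edge = {(1, 2), (2, 3), (3, 4)}
path = set()
changed = True
while changed:
    before = len(path)
    # path(x,y) :- edge(x,y).
    path |= edge
    # path(x,z) :- path(x,y), edge(y,z).  (left recursion: fine here,
    # but it would loop forever under Prolog's depth-first search)
    path |= {(x, z) for (x, y) in path for (y2, z) in edge if y == y2}
    changed = len(path) > before
print(sorted(path))  # the full transitive closure, six tuples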
Even in other Turing-complete logic programming languages, like Mercury and Curry, I have a lot more freedom in how I express the problem because they have the freedom to optimize the code in a way that allows the solution to be arrived at efficiently. Curry can guarantee a solution will be arrived at if one exists, Prolog cannot.
The extra-logical control Prolog provides (like cuts) only serves as a tool to work around the poor search algorithm it uses, one it cannot change because of the control it gives. To me it's just an evolutionary dead-end, albeit one of great historical importance.
The wall of text is fine, but I am not convinced. If you really have a real-life, practical problem with cuts, then your comment fails to articulate it, because I didn't understand what it is meant to be. If it is a matter of aesthetics - higher order languages look prettier and we could, in theory (but not in practice) parallelise them - then I don't care about that. Aesthetics has no place in pragmatic considerations like the amount of resources you really need, to be able to use a language. Case in point:
>> We don't have a good strategy for automating this in a compiler yet, but the capability is there
And it's been "there" since the 1960s, but it's still not here. Looks pretty on paper, doesn't work on metal. Close, but no cigar.
>> They both exist because a Prolog implementation is intimately tied to a specific search strategy.
That is not right. There are different execution strategies for Prolog other than DFS. In fact, Datalog's bottom-up evaluation with a TP-operator is one of them: you can apply it to Prolog programs just fine, except without the nice termination guarantees; but then, Church-Turing thesis.
Another execution strategy is SLG-Resolution with tabulation, a.k.a. tabling, which I mentioned in my other comment. I copy the link to SWI-Prolog's implementation here again:
https://www.swi-prolog.org/pldoc/man?section=tabling
Tabling is as old as nails in the Prolog community [1]. Besides which, there is a ton of computer science knowledge you can bring to bear on the problem of avoiding non-termination, or unnecessary backtracking. If that is, indeed, the problem. Like I say, I don't understand from your comment what your real-life, real-world, practical problem with using cuts is.
>> Prolog didn't relinquish enough control. We can't express certain problems without resorting to weird dances to appease the poor search algorithm, same as how I might have to contort the way I write a C program so the compiler can vectorize the code.
I don't understand what that means. Prolog is FOL running on a digital computer. One is a computational paradigm, the other is a computational device. There is a mismatch between the two, unfortunately and inevitably, because the difference between theory and practice etc etc, and there are concessions that had to be made to be able to run one on top of the other. Messrs Kowalski and Colmerauer (the man whose name not even French computer scientists can pronounce; true story) alighted on a set of choices that had the advantage of being the most practical possible, for the purpose of having a real language, that one can write real code in, to run on real computers; unlike, say, Haskell.
And you can do a lot more than just "write code". Prolog, because its interpreter is an implementation of SLD-Resolution, is not just a programming language but a sound and (refutation-) complete deductive inference system for FOL. And with recent advances in Inductive Logic Programming, it turns out it's also a sound and refutation complete inductive inference system, too.
Another couple of points:
DFS is not a poor choice of a search algorithm. It's an efficient one. You say ASP is "massively more efficient" than Prolog. It's not. ASP interpreters must first solve an NP-complete problem, which they only do thanks to powerful, special-purpose solvers. Those solvers are fast [2], but efficiency is a whole other ball game (we don't have efficient algorithms for NP-complete problems). The NP-complete problem is that, because of ASP's bottom-up interpretation, the entire Herbrand base of a predicate must be fully ground. That's why ASP also has no functions [3]; so, no lists. The big advantage of Resolution, its One Neat Trick as an algorithm, is exactly that it doesn't have to ground the entire Herbrand base before finding a proof, if one exists. That's thanks to unification. More to the point, as Resolution proceeds, it prunes the search space of proofs, just by discarding branches starting at goals that fail unification. ASP can't do that [4,5].
See, every time anyone has tried to "fix" Prolog's not-fully-100%-declarativeness, they've just broken it.
Or, as I like to say about logic programming in general; sound, complete, efficient: choose two.
_______________
[1] OLD-Resolution with Tabulation; Tamaki and Sato, 1986 https://dl.acm.org/doi/10.5555/12069.12075
[2] And cool.
[3] Because functions turn a predicate's Herbrand base infinite, and then it takes an infinite amount of time to ground it.
[4] Well, ish. Standard, bottom-up ASP can't, but there are different ways to evaluate ASP ... programs, too. For instance, you can evaluate them top-down, with unification, just like in Prolog. See sCASP (first link I could find; dig deeper):
http://platon.etsii.urjc.es/~jarias/slides/gde21-tutorial.pd...
[5] It's not just unification pruning the search space of proofs, either. It's that the cardinality of the resolution closure of a set of Horn clauses decreases monotonically until either the empty clause is constructed by resolution, or... well, infinity.
I was thinking of an idea very similar to this some weeks ago, where you define the "rules" that structure your program, and I think Prolog is the one! I'll look into getting a taste this weekend.
Prolog seems cursed to be forgotten and re-discovered in a never ending cycle.
Some time ago I worked on cockroachdb and I was working on implementing planning for complex online schema changes.
We really wanted a model that could convincingly handle and reasonably schedule arbitrary combinations of schema change statements that are valid in Postgres. Unlike MySQL, Postgres offers transactional schema changes. Unlike Postgres, cockroach strives to implement online schema changes in a protocol inspired by F1 [0]. Also, you want to make sure you can safely roll back (until you've reached the point where you know it can't fail, after which only metadata updates are allowed).
The model we came up with was to decompose all things that can possibly change into "elements" [1], where each element has a schedule of state transitions that move the element through a sequence of states from public to absent or vice versa [2]. Each state transition has operations [3].
Anyway, you end up wanting to define rules that say that certain element states have to be entered before others if the elements are related in some way. Or perhaps some transitions should happen at the same time. To express these rules I created a little Datalog-like framework I called rel [4]. This lets you embed in Go a rules engine that you can then add indexes to, so that you have a sufficiently efficient implementation and know that all your lookups are indexed statically. You write the rules in Go [5]. To be honest, it could be more ergonomic.
The rules are written in Go but for testing and visibility they produce a datomic-inspired format [6]. There’s a lot of rules now!
The internal implementation isn’t too far off from the search implementation presented here [7]. Here’s unify [8]. The thing has some indexes and index selection for acceleration. It also has inverted indexes for set containment queries.
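For readers who haven't seen one: the core of a unify like that is surprisingly small. Here's a generic first-order unification sketch in Python (the textbook algorithm, not cockroachdb's rel implementation; variables are strings starting with "?"):

# Terms: atoms (ints, plain strings), variables ("?x"), or tuples.
# Returns an extended substitution dict, or None on failure.
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, subst):  # chase variable bindings to their final value
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}  # no occurs check, as in most Prologs
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

print(unify(("edge", "?x", 2), ("edge", 1, "?y"), {}))  # {'?x': 1, '?y': 2}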
It was fun to make a little embedded logic language and to have had a reason to!
0: https://static.googleusercontent.com/media/research.google.c...
1: https://github.com/cockroachdb/cockroach/blob/f48b3438a296aa...
2: https://github.com/cockroachdb/cockroach/blob/f48b3438a296aa...
3: https://github.com/cockroachdb/cockroach/blob/f48b3438a296aa...
4: https://github.com/cockroachdb/cockroach/blob/f48b3438a296aa...
5: https://github.com/cockroachdb/cockroach/blob/f48b3438a296aa...
6: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql...
7: https://github.com/cockroachdb/cockroach/blob/f48b3438a296aa...
8: https://github.com/cockroachdb/cockroach/blob/f48b3438a296aa...
i tried writing a tiny logic engine once after reading the SICP chapter, and yeah, the syntax part was easy to mimic but the actual backtracking logic hit me harder than expected. what helped was thinking of it less like solving and more like building a lazy search tree. once that clicked, i stopped trying to force control flow and just let the tree expand. one thing i didn't see many mention: handling state or side effects blows up fast. even printing during backtracking messes the whole thing up. most tutorials avoid that part completely
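fwiw that framing maps straight onto generators; a toy sketch in Python (made-up API, nothing standard) where a goal is a function from a substitution to a lazy stream of substitutions:

# Goals: substitution -> iterator of substitutions. Backtracking
# falls out of laziness: branches expand only when consumed.
def eq(a, b):
    def goal(subst):
        a2, b2 = subst.get(a, a), subst.get(b, b)
        if a2 == b2:
            yield subst
        elif isinstance(a2, str):  # treat strings as variables here
            yield {**subst, a2: b2}
        elif isinstance(b2, str):
            yield {**subst, b2: a2}
    return goal

def disj(g1, g2):  # OR: stream g1's answers, then g2's
    return lambda subst: (s for g in (g1, g2) for s in g(subst))

def conj(g1, g2):  # AND: feed each of g1's answers into g2
    return lambda subst: (s2 for s1 in g1(subst) for s2 in g2(s1))

# x == 1 and (y == 2 or y == x)
goal = conj(eq("x", 1), disj(eq("y", 2), eq("y", "x")))
print(list(goal({})))  # [{'x': 1, 'y': 2}, {'x': 1, 'y': 1}]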
That's one of the reasons I tell people to avoid Prolog. Leave the side effects to the host language and embed either a miniKanren or Datalog engine.
yeah agree. keeping side effects in the host lang makes life easier. tried forcing it in prolog once and instantly regretted it. miniKanren with clean boundaries felt way more maintainable
>> The main issue is that Prolog is not truly declarative; how you write the rules has a major impact on how the interpreter behaves. You might end up with repeated answers for a query or it might enter an infinite loop. Prolog also allows IO, so the order in which things get executed is critical. It’s Turing-complete, for better or worse.
Yeah, sounds like a real show-stopper. Prolog is not 100% declarative, which is really what every programmer wants from a programming language.
Better use a language like C# or Java, or Python, which are 0% declarative. Or try to implement Datalog (only not really) so that you can't even use lists, because that's what real-world programmers are looking for in a real-world programming language: 100% declarative, 0% real-world uses.
Ah well, at least the author has heard of SLD-Resolution which is more than one can say about the authors of SICP. But they haven't heard of SLG Resolution with tabulation (a.k.a. "tabling") which is a way to execute Prolog programs while preserving the Turing-completeness but without getting stuck in left-recursions. Among other things.
https://www.swi-prolog.org/pldoc/man?section=tabling
Edit:
>> Prolog also allows IO, so the order in which things get executed is critical.
I paid €7 to make this post because I'm on a boat in the middle of the Adriatic sea and my Vodafone contract is AWOL, so listen closely: no, Prolog's IO is not what makes it Turing-complete. That comes from definite logic, i.e. a fragment of the First Order Predicate Calculus (a.k.a. First Order Logic) restricted to formulae in clausal form with at most one positive literal (a.k.a. Horn clauses), which nevertheless retains the full expressivity of full FOL. Definite logic is semi-decidable and, when evaluated by Resolution, as SLD-Resolution (restricted to definite program clauses and Horn goals), it is refutation-complete (and sound, to boot). Meaning, in summary, that any true sentence can be proved true, but if a sentence is false then it cannot always be proved false by SLD-Resolution. So far, so theoretically good.
What Prolog does that is really dangerous, and that makes the order of execution "critical", is that it allows the programmer to edit the program database, i.e. the database of facts and rules (in other words, the program itself) programmatically, i.e. from the program itself. This means that not all your program's state is known before the program runs and its database operations are completed. That sucks hairy balls, but it is also a very useful mechanism that requires experience and understanding of the language, its core concepts, and programming in general, to use right.
But Prolog's Turing completeness has nothing to do with IO, or database ops. That was not worth €7.
Or more accurately, a super simple Datalog implementation.
Not really logic programming, but a while ago I made this, also in Python: https://github.com/ariroffe/logics (mainly for educational purposes). Parts of the implementation kind of reminded me of it.