MrWibble said:
My problem with what Sweeny is saying, is that he's addressing how to express a concept in a better way, but none of this really deals with how to translate that expression into something the computer can actually execute efficiently.
His example of qsort in C++ and Haskell illustrates this - yes, the Haskell version is much much neater, but the compiler will either have to be orders of magnitude smarter than what we have right now, or it'll be slower than the C++ version by the time it's compiled - both versions will have to be translated into the same language to actually execute on a CPU...
Besides which, the C++ version is only complicated looking because it's already been written in a reasonably optimal form - I could write a less efficient C++ version that also looks neat.
Here is the difference between a procedural language and a good declarative or functional language. In order to express an idea in a procedural language, like a sorted list, you are forced to write the implementation of that idea and then your implementation is used to enforce the idea. In a functional or declaritve language you should only be saying enough to express the idea and it is the job the compiler to find an optimal solution. For cannonical patterns like sorted lists, the optimal implementation for it, given the data representation, should already be baked into the brains of the compiler or interpreter. The ideal expression of a sorted list is as follows:
S is a list of some type A
<= is a comparison on elements of type A
S_x is the xth item in S
The following is the optimal expression of a sorted list.
Sorted(S) if and only if S_1 <= S_2 <= ... <= S_size(S)
The expression of a sorted list at the most abstract level is so uniform and frequently used that the actual algorithms for sorting should be baked into the compiler. All the compiler has to do now is 1) notice that what is being expressed as a constraint is that the list needs to be sorted and 2) perform the most optimal sort for the current data-representation of the list.
* If you are writing procedural code, you are telling the compiler how to solve the problem.
* If you are writing declarative code, you are telling the compiler what you would like done, but not how to do it.
Which method is better? In the first case you can solve your problem however you like using as optimal a solution as you like. For now it is necessary to tailor algorithms and their implementations to the quirks of embedded systems. This gives fast code (provided you're good at what you do. On the other hand you are taking your time to write solutions to problems which have been solved 1000s of times over by 1000s of different people across the globe. You're re-inventing the wheel and often it is not even to find the solution on a quirky hardware.
In the second case, you solve the problem once, formulate a concise expression for it and add the know-how of the solution to the compiler so that other people can begin to use your solution. What gets optimised then is the compiler and not the expression of the idea within the language. In declarative languages you need only invoke the idea. Some ideas are procedural in nature and their invocation will look procedural. Ideas which are not procedural will not look procedural. This gives massive amounts of freedom to the compiler to find an optimal solution. When you are writing a declarative problem in a procedural language you are being FORCED to pick 1 and only 1 method for solving the declarative problem in a procedural way.
Sorting is a declarative problem. The compiler should know that what is being expressed is the desire that the data be sorted and it is the compiler which should figure out the best way to do that given the data representation of the list. In declarative languages the actual data representation is hidden from the programmer. The programmer simply expresses his ideas as succinctly as possible.
Right now I'm of the mind that procedural languages should only be used in writing compilers and declarative languages should be used to express ideas about data manipulation.
MrWibble said:
The analogy of spoken languages is reasonable here - there have been attempts to design new languages that express things less ambiguously (than for example, English, which is really a horrible language in so many ways) - but AFAIK they've all failed to take off. I'm thinking that sometimes it's just better to adapt the language we all speak as the world changes, rather than try to get everyone in the whole world to learn something completely new. Sometimes we learn more than one language, other times we just splice bits of language together and grow our native tongues to encompass and express foreign ideas.
You're joking right? There is no such language as English. It doesn't even exist at an abstract level. We're all speaking with our own dialects and we can communicate in as much as they overlap. Our individual languages overlap most when we are talking about very clear domains and patterns of information. Medical doctors have their own domain specific language that they use. Business salesmen have their own domain specific language. Each sub-culture has its own domain specific language. The domain might use lots of words that you speak normally and the meanings of those words might not stray very far from what you would expect. But the fact that the words are used within a domain means there is a force which is anchoring the meaning of the words. When the words are spoken within the domain, it is the meaning of the domain that is invoked and not the meaning of your ordinary vocabulary. We intercommunicate best with each other when there is a clear domain to anchor the meaning of the words being used. A society will fracture if there are no domains to anchor language and meaning.
MrWibble said:
So perhaps there are elements of other languages that us C/C++ folks should learn and try to integrate into our "native" language, but I'm not sure we should adopt an entirely new way of writing code until someone can show me that it's actually a solution and not just a rephrasing of the problem.
From a theoretic standpoint, why code a C compiler by hand for each new processor? What you need is a meta-framework where the theory of finite computation is known and the definition of the processor will co-mingle with the definition of C and its known algorithms such that a compiler for C on that processor can be generated automatically. The meta-framework is a compiler compiler, and it saves a tremendous amount of recoding things by hand.
The cutting edge of technology is proof theory and automated proving systems. How long before these systems can find optimal implementations we've never thought about? 20 years? In 20 years you should be writing more declarative code than procedural code. The study of linguistics combined with the meta-theory of computation I think is going to revolutionize programming language theory. I'm hoping it produces some very nice tools in the future which will cut out procedural programming and will allow people to write declarative solutions behind which a compiler will break the problem down and farm sub-problems out to be solved by specialized problem solving engines, which they themselves might be multi-processor spanning.