MMS • Daniel Dominguez
Article originally posted on InfoQ. Visit InfoQ
A roundup of recent research, applications, and releases involving Large Language Models (LLMs) for the week starting September 4th, 2023.
PointLLM: Empowering Large Language Models to Understand Point Clouds
This paper introduces PointLLM, an approach for extending Large Language Models (LLMs) to understand 3D data, specifically point clouds. PointLLM takes colored object point clouds together with human instructions and generates contextually relevant responses, demonstrating that it grasps point cloud concepts. On two evaluation benchmarks, Generative 3D Object Classification and 3D Object Captioning, PointLLM outperforms existing 2D baselines, and human evaluators rated its captions above human-written annotations in over 50% of object captioning samples.
Code, datasets, and benchmarks are available at https://github.com/OpenRobotLab/PointLLM
WALL-E: Embodied Robotic WAiter Load Lifting with Large Language Model
This paper explores integrating Large Language Models (LLMs) with visual grounding and robotic grasping systems to enhance human-robot interaction, exemplified by the WALL-E (Embodied Robotic WAiter Load Lifting with Large Language Model) system. WALL-E uses ChatGPT to generate target instructions through interactive dialogue; a visual grounding system then processes these instructions to estimate the target object's pose and size, enabling the robot to grasp it accordingly. Experiments in several real-world scenarios demonstrate the feasibility and effectiveness of the integrated framework.
More information can be found on the project website https://star-uu-wang.github.io/WALL-E/
AskIt: Unified Programming Interface for Programming with Large Language Models
In this paper, the authors present AskIt, a domain-specific language (DSL) designed to simplify the integration of Large Language Models (LLMs) into software development. AskIt offers type-guided output control, template-based function definitions, and a unified interface that bridges the gap between LLM-based code generation and application integration. It uses Programming by Example (PBE) for few-shot learning at the programming language level, and in benchmark experiments it achieved significant reductions in prompt length along with improved speed. The goal is to make the emergent abilities of LLMs easier to use efficiently across a range of software development tasks.
The implementations of AskIt in TypeScript and Python are available at https://github.com/katsumiok/ts-askit and https://github.com/katsumiok/pyaskit, respectively.
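To make the ideas of template-based function definitions and type-guided output control more concrete, here is a minimal Python sketch. It is not the AskIt or pyaskit API: the names render, build_prompt, parse_typed, define_typed, and fake_llm are hypothetical, the {{name}} placeholder syntax and the JSON answer format are assumptions made for illustration, and fake_llm is a stand-in that returns a computed reply instead of calling a real model.

```python
import json
import re
from typing import Any, Callable


def render(template: str, **args: Any) -> str:
    """Fill {{name}} placeholders in a prompt template (assumed syntax)."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(args[m.group(1)]), template)


def build_prompt(template: str, expected_type: type, **args: Any) -> str:
    """Append an output-format instruction derived from the expected type."""
    task = render(template, **args)
    return f'{task}\nAnswer with JSON of the form {{"answer": <{expected_type.__name__}>}}.'


def parse_typed(raw: str, expected_type: type) -> Any:
    """Parse the reply and reject values that do not match the expected type."""
    value = json.loads(raw)["answer"]
    # bool is a subclass of int in Python, so guard against it explicitly.
    if isinstance(value, bool) and expected_type is not bool:
        raise TypeError(f"expected {expected_type.__name__}, got bool")
    if not isinstance(value, expected_type):
        raise TypeError(f"expected {expected_type.__name__}, got {type(value).__name__}")
    return value


def define_typed(expected_type: type, template: str,
                 llm: Callable[[str], str]) -> Callable[..., Any]:
    """Turn a prompt template into an ordinary function with a typed result."""
    def fn(**args: Any) -> Any:
        prompt = build_prompt(template, expected_type, **args)
        return parse_typed(llm(prompt), expected_type)
    return fn


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: sums the integers in the task line."""
    numbers = [int(n) for n in re.findall(r"-?\d+", prompt.splitlines()[0])]
    return json.dumps({"answer": sum(numbers)})


add = define_typed(int, "Add {{x}} and {{y}}.", fake_llm)
print(add(x=1, y=3))
```

In this sketch the expected type does two jobs: it shapes the format instruction added to the prompt, and it validates the reply before the value reaches application code. A real integration would replace fake_llm with a call to an actual model.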
Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models
This paper introduces Jais and Jais-chat, two Arabic-centric large language models (LLMs) with 13 billion parameters. The models outperform existing Arabic and multilingual models in Arabic knowledge and reasoning capabilities, and remain competitive in English despite being trained on less English data. The authors provide a detailed account of training, tuning, safety measures, and evaluations, and release both the foundation Jais model and the instruction-tuned Jais-chat variant to foster research on Arabic LLMs.
The Jais-chat model is accessible at https://huggingface.co/inception-mbzuai/jais-13b-chat