Software / Simulation Archives - The Robot Report
https://www.therobotreport.com/category/software-simulation/

AWS offers accelerated robotics simulation with NVIDIA
https://www.therobotreport.com/aws-offers-accelerated-robotics-simulation-nvidia/ | Tue, 03 Dec 2024

AWS and NVIDIA said that Isaac Sim on Amazon Web Services can significantly accelerate and scale robot simulation and AI training.

AWS and Isaac Sim can help accelerate robotics development, says NVIDIA.

NVIDIA Corp. today announced at AWS re:Invent enhanced tools for robotics developers, as well as the availability of NVIDIA DGX Cloud on Amazon Web Services and offerings for artificial intelligence and quantum computing.

The company said that NVIDIA Isaac Sim is now available on NVIDIA L40S graphics processing units (GPUs) in Amazon Elastic Compute Cloud (EC2) G6e instances. It said this could double the performance of robotics simulation at scale and accelerate AI model training. Isaac Sim is a reference application built on NVIDIA Omniverse for developers to simulate and test AI-driven robots in physically based virtual environments.

With NVIDIA OSMO, a cloud-native orchestration platform, developers can easily manage their complex robotics workflows across their AWS computing infrastructure, claimed the company.

“This combination of NVIDIA-accelerated hardware and software — available on the cloud — allows teams of any size to scale their physical AI workflows,” wrote Akhil Docca, senior product marketing manager for Omniverse at NVIDIA.


What is ‘physical AI?’

According to NVIDIA, “physical AI” describes AI models that can understand and interact with the physical world. The company said it “embodies the next wave of autonomous machines,” such as self-driving cars, industrial manipulators, mobile robots, humanoids, and even robot-run infrastructure like factories and warehouses.

With physical AI, developers are embracing a “three-computer solution” for training, simulation, and inference to make breakthroughs, NVIDIA said. Yet physical AI for robotics systems requires robust training datasets to achieve precision inference in deployment. Developing such datasets and testing them in real situations can be impractical and costly.

Simulation offers an answer, as it can accelerate the training, testing and deployment of AI-driven robots, the company asserted.

L40S GPUs in the cloud scale simulation, training

Developers can use simulation to verify, validate, and optimize robot designs as well as the systems and their algorithms before deployment, said NVIDIA. It added that simulation can optimize facility and system designs before construction or remodeling starts for maximum efficiencies, reducing costly manufacturing change orders.

Amazon EC2 G6e instances accelerated by NVIDIA L40S GPUs can double performance over the prior architecture, while allowing the flexibility to scale as scene and simulation complexity grows, NVIDIA said. Roboticists can use these instances to train many computer vision models that power AI-driven robots.

This means the same instances can be extended for various tasks, from data generation and simulation to model training. NVIDIA added that OSMO allows teams to orchestrate and scale complex robotics development workflows across distributed computing resources, whether on premises or in the AWS cloud.

NVIDIA said Isaac Sim can foster collaboration and critical workflows, such as generating synthetic data for perception model training.

A reference workflow combines NVIDIA Omniverse Replicator, a framework for building custom synthetic data generation (SDG) pipelines and a core extension of Isaac Sim, with NVIDIA NIM microservices. With this workflow, developers can build generative AI-enabled SDG pipelines, the company said.

These include the USD Code NIM microservice for generating Python USD code and answering OpenUSD queries, plus the USD Search NIM microservice for exploring OpenUSD assets using natural language or image inputs.

The Edify 360 HDRi NIM microservice can generate 360-degree environment maps, while the Edify 3D NIM microservice can create ready-to-edit 3D assets from text or image prompts. Generative AI can thus ease the synthetic data generation process by reducing many tedious and manual steps, from asset creation to image augmentation, said NVIDIA.

  • Rendered.ai’s synthetic data engineering platform is integrated with Omniverse Replicator. It enables companies to generate synthetic data for computer vision models used in industries from security and intelligence to manufacturing and agriculture.
  • SoftServe Inc., an IT consulting and digital services provider, uses Isaac Sim to generate synthetic data and validate robots used in vertical farming with Pfeifer & Langen, a leading European food producer.
  • Tata Consultancy Services is building custom synthetic data generation pipelines to power its Mobility AI suite to address automotive and autonomous use cases by simulating real-world scenarios. Its applications include defect detection, end-of-line quality inspection, and hazard avoidance.
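To make the SDG workflow described above more concrete, the sketch below shows what a minimal Omniverse Replicator script typically looks like: randomize a few stand-in assets, then write out annotated frames for perception training. It is an illustrative example only, not code from NVIDIA's announcement, and exact module paths and writer options can vary by Isaac Sim version.

```python
# Minimal Omniverse Replicator-style SDG sketch (illustrative; API details vary by Isaac Sim version).
# Intended to run inside Isaac Sim's Python environment, where omni.replicator.core is available.
import omni.replicator.core as rep

with rep.new_layer():
    camera = rep.create.camera(position=(0, 0, 5))
    render_product = rep.create.render_product(camera, (1024, 1024))

    # Simple stand-in props; a production pipeline would reference OpenUSD assets instead.
    props = rep.create.cube(count=10, semantics=[("class", "prop")])

    with rep.trigger.on_frame(num_frames=200):
        with props:
            rep.modify.pose(
                position=rep.distribution.uniform((-2.0, -2.0, 0.0), (2.0, 2.0, 2.0)),
                rotation=rep.distribution.uniform((0, 0, 0), (0, 0, 360)),
            )

    # Write RGB frames plus bounding-box labels for perception-model training.
    writer = rep.WriterRegistry.get("BasicWriter")
    writer.initialize(output_dir="_out_sdg", rgb=True, bounding_box_2d_tight=True)
    writer.attach([render_product])
    rep.orchestrator.run()
```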

NVIDIA, AWS help robots learn in simulation

While Isaac Sim enables developers to test and validate robots in physically accurate simulation, Isaac Lab, an open-source robot learning framework built on Isaac Sim, provides a virtual playground for building robot policies that can run on AWS Batch. Because these simulations are repeatable, developers can troubleshoot and reduce the number of cycles required for validation and testing, said NVIDIA.

The company cited robotics startups that are already using Isaac Sim on AWS: 

  • Field AI is building robot foundation models to enable robots to autonomously manage a wide range of industrial processes. It uses Isaac Sim and Isaac Lab to evaluate the performance of these models in complex, unstructured environments in construction, manufacturing, oil and gas, mining, and more.
  • Vention, which offers a full-stack cloud-based automation platform, is creating pretrained skills to ease development of robotic tasks, noted NVIDIA. It is using Isaac Sim to develop and test new capabilities for robot cells used by small to midsize manufacturers.
  • Cobot offers Proxie, its AI-powered collaborative mobile manipulator. It uses Isaac Sim to enable the robot to adapt to dynamic environments, work alongside people, and streamline logistics in warehouses, hospitals, airports, and more.
  • Standard Bots is simulating and validating the performance of its R01 robot used in manufacturing and machining setup.
  • Swiss-Mile is using Isaac Sim and Isaac Lab for robot learning so that its wheeled quadruped robots can perform tasks autonomously with new levels of efficiency in factories and warehouses.
  • Cohesive Robotics has integrated Isaac Sim into its software framework called Argus OS for developing and deploying robotic workcells used in high-mix manufacturing environments.
  • Aescape’s robots are able to provide precision-tailored massages by accurately modeling and tuning the onboard sensors in Isaac Sim.

NVIDIA made other announcements in addition to the availability of Isaac Sim 4.2 on Amazon EC2 G6e Instances powered by NVIDIA L40S GPUs on AWS Marketplace.

It said that NVIDIA DGX Cloud can run on AWS for training AI models; that AWS liquid cooling is available for data centers using its Blackwell platform; and that NVIDIA BioNeMo NIM microservices and AI Blueprints, developed to advance drug discovery, are now integrated into AWS HealthOmics.

The company also announced the availability of its latest AI Blueprints on AWS for video search and cybersecurity, the integration of NVIDIA CUDA-Q with Amazon Braket for quantum computing development, and RAPIDS Quick Start Notebooks on Amazon EMR.

Realtime Robotics appoints Ville Lehtonen vice president of product
https://www.therobotreport.com/realtime-robotics-appoints-ville-lehtonen-vice-president-of-product/ | Sun, 01 Dec 2024

Realtime Robotics has named Ville Lehtonen, who previously worked at HighRes Biosolutions and Pickle Robot, to lead its product efforts.

Optimization evaluates multiple paths, sequences, poses, end-of-arm tool rotations, and interlocks for robots within a workcell. Source: Realtime Robotics.

Realtime Robotics, a leader in collision-free autonomous motion planning for industrial robots, last week named industry veteran Ville Lehtonen as its vice president of product.

Lehtonen brings experience in technology, product, and management, said Realtime Robotics. He most recently served as head of product at Pickle Robot Co., which he guided to a leadership position in the truck and container loading and unloading industry.

“Ville’s track record speaks for itself, and we’re confident he will be an excellent addition to the team,” said Kevin Carlin, chief commercial officer at Realtime Robotics.

“Our Optimization solution is already helping several manufacturing companies to reduce cycle times and improve productivity,” Carlin stated. “With Ville’s expertise, we can evolve to meet additional customer needs and expand its adoption throughout the manufacturing and logistics industries.”


Lehtonen expects ‘a massive gear change’

Prior to Pickle, Lehtonen was head of product for HighRes Biosolutions, a laboratory automation software company, and he was a co-founder and CEO of LabMinds Ltd., a laboratory automation company.

Lehtonen holds a BS and an MS in computer science from the Helsinki University of Technology and an MBA from Oxford University.

Ville Lehtonen. Source: LinkedIn

“I look forward to helping already highly automated production lines become even more efficient and cost-effective with the use of Realtime’s Optimization technology,” he said. “I am confident we can help manufacturers save tens of thousands of hours on their industrial robotics projects.”

“What Realtime is doing is a massive gear change in deploying automation,” Lehtonen added. “While this will be incredibly helpful for current manufacturers, the most exciting opportunities come from unlocking the economics for companies operating on a far smaller scale than the heavy users of robots. Realtime’s technology stack also can do for kinematics what real-time object-detection frameworks like YOLO [You Only Look Once] have done for computer vision, further lowering the barriers to entry in the robotics space.”

About Realtime Robotics

Boston-based Realtime Robotics said its technology generates optimized motion plans and interlocks to achieve the shortest possible cycle time in single and multi-robot workcells. The company claimed that its systems expand the potential of automation, empowering multiple robots to work closely together in unstructured and collaborative workspaces, reacting to dynamic obstacles the instant changes are perceived.

Realtime said its Optimization product uses a combination of proprietary software and experienced robotics and application engineering insights to drastically improve a manufacturer’s overall productivity. The system analyzes a customer’s existing digital twin, identifying bottleneck areas and recommending improvements based on desired parameters. 

Optimization can do all of this without interfering with ongoing production efforts, said Realtime Robotics.

Learn about digitalization in the warehouse in new webinar
https://www.therobotreport.com/learn-about-digitalization-in-the-warehouse-in-webinar/ | Wed, 27 Nov 2024

Digitalization of the warehouse involves several emerging technologies; attendees of this free webinar can learn from industry experts.

Digitalization is bringing emerging technologies into the warehouse. Source: Dexory

Designing and deploying a digital warehouse can be a challenge, with numerous technology options to add to your operations. From robotics and automation to the latest data analytics and artificial intelligence, how can you take advantage of digitalization?

At 2:00 p.m. EST on Wednesday, Dec. 4, expert panelists will discuss how emerging technologies are changing how engineers design warehouse systems and how businesses can gain insights and efficiencies with them. Sensors, digital twins, wearables, and virtual assistants are some of the tools that are part of this digital transformation.

In this free webinar, viewers can learn about:

  • Ways to improve labor productivity with workforce management
  • The orchestration of people and autonomous mobile robots (AMRs) for order picking and fulfillment
  • Where augmented and virtual reality (AR/VR) fit in the warehouse
  • How AI will change how operators use data in a positive feedback cycle
  • How to scale digital transformation across facilities and the supply chain

Register now to attend this webinar on digitalization, and have your questions answered live. Registrants will be able to view it on demand after the broadcast date.

Digitalization speakers to share insights

Robert C. Kennedy, principal at RC Kennedy Consulting, will discuss digitalization in the warehouse.

Robert C. Kennedy is principal at RC Kennedy Consulting. For over four decades, he has planned, developed, and implemented industry-leading supply chain execution systems around the globe. Kennedy and his staff have led more than 200 large-scale implementation projects of supply chain execution software for leading customers in a variety of industries, including pharmaceutical, electronics, third-party logistics (3PL), and food and beverage.

A leading voice in the industry, Kennedy is regularly interviewed by industry media, has published articles, and has presented at numerous trade shows and seminars.

RC Kennedy Consulting provides assistance to companies to improve operational efficiencies through process design and systems. It also helps them develop strategies for growth.

Ken Ramoutar will discuss digitalization in the warehouse.

Ken Ramoutar is chief marketing officer at Lucas Systems, which helps companies transform their distribution center by dramatically increasing worker productivity, operational agility, and customer and worker satisfaction using voice and AI optimization technologies.

In his 25 years of customer-centric roles in supply chain software and consulting, Ramoutar has navigated companies through uncertainty and volatility as a thought leader and change agent.

Prior to Lucas, Ken was senior vice president and global head of customer experience at Avanade, a $3 billion Accenture and Microsoft-owned company, and he has held leadership roles at IBM, Sterling Commerce, and SAP/Ariba.

Michael Taylor is chief product officer and co-founder of Duality AI.

Michael Taylor is the chief product officer and co-founder of Duality AI. He has a 20-year career in mobile robotics, with 15 years dedicated to building autonomous field robots at Caterpillar.

While there, Mike led the team developing the autonomy system for Caterpillar’s autonomous dozer, and he helped launch the Autonomous Mining Truck program. His roles included architecting behaviors and planning systems, as well as building a collection of simulation technologies to accelerate deployment to customer sites.

Taylor was also part of the Carnegie Mellon team that won DARPA’s Urban Challenge, where he led both the Controls Team and the Field Calibration Team. Taylor holds dozens of patents in fields ranging from robotics to simulation technologies.

At Duality AI, Taylor leads the company’s Product and Solutions Engineering team. He is responsible for steering Duality’s product strategy, developing technologies to address customer needs, and helping ensure that customers maximize the value they extract from Falcon. This includes projects ranging from a simulation solution to support a drone-based AI perception system, to generating synthetic data for high-volume manufacturing quality assurance, to characterizing and modeling of uncrewed ground vehicles (UGVs) navigating novel environments. 

Eugene Demaitre, editorial director for robotics at WTWH Media

Eugene Demaitre, moderator, is the editorial director for robotics at WTWH Media, which produces Automated Warehouse, The Robot Report, the Robotics Summit & Expo, and RoboBusiness. Prior to working for WTWH Media, he was an editor at BNA (now part of Bloomberg), Computerworld, TechTarget, Robotics Business Review, and Robotics 24/7.

Demaitre has participated in conferences worldwide, as well as spoken on numerous webcasts and podcasts. He is always interested in learning more about robotics. He has a master’s from the George Washington University and lives in the Boston area.

This webinar is sponsored by Balluff and Dexory.


Duality AI offers developers EDU license for Falcon digital twins, synthetic data
https://www.therobotreport.com/duality-ai-offers-developers-edu-license-for-falcon-digital-twins-synthetic-data/ | Thu, 21 Nov 2024

The EDU program offers subscribers full access to Falcon’s comprehensive feature set, alongside community resources developed by Duality AI.

Scenarios in Duality AI's Falcon Editor, including an electrical tower, an automated guided vehicle, an autonomous mobile robot, and a humanoid robot.

The Falcon digital twin platform provides high-fidelity, domain-tailored simulation for a variety of use cases. | Source: Duality AI

Duality AI yesterday launched an EDU license and subscription for its Falcon simulation platform. The company said it designed this new program to equip aspiring artificial intelligence developers with the synthetic data skills needed to create advanced AI vision models.

This educational, non-commercial license is intended to expand access to digital twin simulation, said Duality. The San Mateo, Calif.-based company said it will enable students and developers to build cutting-edge AI models and meet the growing demand for AI professionals across industries.

“Digital twin simulation has unlocked a future where anyone can build AI models safely, rapidly, and affordably,” said Mike Taylor, co-founder and chief product officer of Duality AI. “Now is the perfect time to invest in building a community that can harness these tools.”

“Whether learners come from an engineering, research, or creative background, we’re excited to share our expertise and help them discover how their skills can play a vital role in the evolving AI industry,” he stated.

Falcon generates accurate data for modeling, training

Founded in 2018, Duality AI said its multidisciplinary team includes engineers, simulation specialists, AI and machine learning experts, and technical artists. They have more than 70 patents across robotics, simulation, and visualization.

The company specializes in cases where real-world data is insufficient for achieving the precision required for AI modeling and training of complex operations. Duality said it has developed proven techniques that drive successful outcomes for its customers. 

By bringing high-fidelity digital twins of environments and operating systems into Falcon, organizations can generate accurate data and predictive behavior modeling, said Duality AI. This enables them to deploy automated systems robustly and at scale, the company claimed.

Organizations are using the Falcon platform to help solve problems in AI, robotics, and smart system engineering, said the company. Their applications span off-road autonomous driving, high-volume manufacturing, warehouse automation, and disaster management.

Duality AI told The Robot Report that it is taking a similar approach with the EDU license to its work with NASA’s Jet Propulsion Laboratory on the DARPA RACER program, enabling students to generate synthetic data for outdoor environments and train and test AI models for autonomous off-road vehicles.

Duality AI to extend its expertise to students

As the need for accurate AI vision models continues to grow, so does the need for skills in digital twin simulation and synthetic data generation, said Duality AI.

“There is currently a lack of some key skills — such as creating digital twins or best-practice techniques for getting the most out of synthetic data — that are not that difficult to learn, but make a huge difference,” said a Duality AI spokesman. “We’re helping close that gap.”

The EDU program offers subscribers full access to Falcon’s feature set. It also includes guided exercises and community resources developed by Duality AI’s experts.

“As an example: In Exercise 1 of the program, we are showing roboticists another way to develop the object-detection models that run on their systems,” the spokesman said. “In fact, it’s a method that many in our field don’t think is possible. We want to show them that not only is it possible, but [also] that we can teach them how to bring these skills into their own development patterns.”

To further support all learners, Duality is launching an online community where anyone can ask questions, collaborate on projects, and share their work.

The company said the curriculum itself is designed to build a strong foundation in digital twin and synthetic data workflows, equipping participants with the skills to create high-performance AI vision models independently.

“Falcon is the platform I wish I had as a graduate student,” said Dr. Felipe Mejia, an AI vision engineer at Duality. “I was always searching for datasets to test new algorithms, and working with digital twins in Falcon offers endless opportunities to experiment and explore.”

“It allows me to simulate scenarios not well-covered by real data, and easily investigate model failure modes — like how does object detection success rate change based on obstruction, distance, lighting? Or any other variable,” he noted.

Duality AI added that its EDU subscription is intended to inspire innovation, and it encouraged users to experiment, develop their projects, and apply their learnings across a variety of fields. The company said it “hopes to foster a vibrant community of innovators eager to explore the full potential of synthetic data and digital twin simulation in modern AI applications.”


Flexiv releases Elements Series 3 to simplify robot simulation, programming
https://www.therobotreport.com/flexiv-releases-elements-series-3-simplify-robot-simulation-programming/ | Wed, 20 Nov 2024

Flexiv has released Elements Series 3, which includes a simplified user interface, a rugged teach pendant, and support for multiple robot axes.


The path toward general-purpose robots is being paved by software to accelerate application development. Flexiv last week launched Elements Series 3, the latest version of its adaptive robot control system, which makes programming faster and easier.

The Santa Clara, Calif.-based company said its engineers have simplified the user experience, focusing on human-centered design and semi-automated features.

Founded in 2016, Flexiv said it is dedicated to developing and manufacturing adaptive robots. The company said it has integrated industrial-grade force control, computer vision, and artificial intelligence to deliver “turn-key automation” that can enhance efficiency while reducing operational costs and environmental impact.


Teach pendant plus simulation

Flexiv asserted that its new, ruggedized Teach Pendant and intuitive software allow programmers of any skill level to easily create and manage robotics applications, whether they’re in an office or on the production-line floor.

In addition, the Elements Studio 3D simulation tool allows users to design, test, and refine their applications before deploying them in the real world.

Fully compatible across PCs, the Teach Pendant, and all Flexiv robots, Elements Studio can reduce deployment time and minimize risk by allowing thorough testing in a virtual setting, claimed the company.

Flexiv says its new teach pendant and software are compatible with all PCs. | Source: Flexiv

Flexiv redesigns Motion Bar

As part of the newly released Elements hardware, the Motion Bar has also been redesigned. Based on user feedback, it now includes a status-indicator light and dedicated buttons for mode switching, Freedrive, and Jogging.

Flexiv said operators can use the Motion Bar independently or docked to the Teach Pendant for convenient robot control.

When coupled with the ability to build applications by physically moving a robot into position in Freedrive mode, the need for complex and time-consuming programming is removed. This hands-on approach means anyone can quickly and efficiently build, test, and perfect their application.

Flexiv has redesigned its motion bar, shown here. | Source: Flexiv

Elements 3 supports more motion

Flexiv said it has enabled support for multiple external axes, giving users seven-plus degrees of freedom (7+N DoF) of motion control and making the software suitable for complex tasks involving dual-axis rotary platforms or linear guide rails.

Elements 3 also features additional drag-and-drop function blocks, known as primitives, to accelerate programming and application building. Flexiv recently helped a seafood producer develop a fish fillet-shaping application with its Rizon 4 collaborative robots.

The company said its updated hardware and software are fully compatible with all of its robots, including the newly released Moonlight Adaptive Parallel Robot.

Elements 3 is compatible with all Flexiv robots, shown here. | Source: Flexiv

MIT: LucidSim training system helps robots close Sim2Real gap
https://www.therobotreport.com/mit-lucidsim-training-system-helps-robots-close-sim2real-gap/ | Sun, 17 Nov 2024

LucidSim uses generative AI and physics simulators to create realistic virtual training environments that help robots learn tasks without any real-world data.


For roboticists, one challenge towers above all others: generalization – the ability to create machines that can adapt to any environment or condition. Since the 1970s, the field has evolved from writing sophisticated programs to using deep learning, teaching robots to learn directly from human behavior. But a critical bottleneck remains: data quality.

To improve, robots need to encounter scenarios that push the boundaries of their capabilities, operating at the edge of their mastery. This process traditionally requires human oversight, with operators carefully challenging robots to expand their abilities. As robots become more sophisticated, this hands-on approach hits a scaling problem: the demand for high-quality training data far outpaces humans’ ability to provide it.

A team of MIT CSAIL researchers has developed an approach to robot training that could significantly accelerate the deployment of adaptable, intelligent machines in real-world environments. The new system, called “LucidSim,” uses recent advances in generative AI and physics simulators to create diverse and realistic virtual training environments, helping robots achieve expert-level performance in difficult tasks without any real-world data.

LucidSim combines physics simulation with generative AI models, addressing one of the most persistent challenges in robotics: transferring skills learned in simulation to the real world.

“A fundamental challenge in robot learning has long been the ‘sim-to-real gap’ – the disparity between simulated training environments and the complex, unpredictable real world,” said MIT CSAIL postdoctoral associate Ge Yang, a lead researcher on LucidSim. “Previous approaches often relied on depth sensors, which simplified the problem but missed crucial real-world complexities.”

The multi-pronged system is a blend of different technologies. At its core, LucidSim uses large language models to generate various structured descriptions of environments. These descriptions are then transformed into images using generative models. To ensure that these images reflect real-world physics, an underlying physics simulator is used to guide the generation process.
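As a rough mental model of that three-stage pipeline, the pseudocode below sketches how the pieces fit together. The helper objects (`llm`, `image_generator`, `physics_sim`) are hypothetical stand-ins, not the LucidSim implementation.

```python
# Schematic sketch of the pipeline described above
# (hypothetical helper objects, not the LucidSim implementation).

def generate_training_image(llm, image_generator, physics_sim, scene):
    # 1. A large language model proposes a structured description of an environment.
    description = llm.generate(
        "Describe a cluttered obstacle course for a legged robot, including materials and lighting."
    )

    # 2. The physics simulator provides geometry-true conditioning signals
    #    (depth maps and semantic masks) for the simulated scene.
    depth_map, semantic_mask = physics_sim.render_conditioning(scene)

    # 3. A generative image model produces a realistic view that matches both
    #    the text description and the simulated geometry.
    image = image_generator.generate(prompt=description, depth=depth_map, mask=semantic_mask)

    # The simulator's ground truth (poses, contacts, terrain) stays paired with the image.
    return image, physics_sim.ground_truth(scene)
```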

Related: How Agility Robotics closed the Sim2Real gap for Digit

Birth of an idea: from burritos to breakthroughs

The inspiration for LucidSim came from an unexpected place: a conversation outside Beantown Taqueria in Cambridge, MA.

“We wanted to teach vision-equipped robots how to improve using human feedback. But then, we realized we didn’t have a pure vision-based policy to begin with,” said Alan Yu, an undergraduate student at MIT and co-lead on LucidSim. “We kept talking about it as we walked down the street, and then we stopped outside the taqueria for about half an hour. That’s where we had our moment.”


To cook up their data, the team generated realistic images by extracting depth maps, which provide geometric information, and semantic masks, which label different parts of an image, from the simulated scene. They quickly realized, however, that with tight control on the composition of the image content, the model would produce nearly identical images from the same prompt. So, they devised a way to source diverse text prompts from ChatGPT.

This approach, however, only resulted in a single image. To make short, coherent videos which serve as little “experiences” for the robot, the scientists hacked together some image magic into another novel technique the team created, called “Dreams In Motion (DIM).” The system computes the movements of each pixel between frames, to warp a single generated image into a short, multi-frame video. Dreams In Motion does this by considering the 3D geometry of the scene and the relative changes in the robot’s perspective.
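The warping idea can be pictured as a standard depth-based reprojection: back-project each pixel of the target frame using the simulator's geometry, move it into the source camera's frame, and sample the generated image there. The sketch below illustrates that generic technique with NumPy and OpenCV; it is not the team's Dreams In Motion code.

```python
# Generic depth-based image warping (illustrative; not the Dreams In Motion implementation).
import numpy as np
import cv2

def warp_source_to_target(src_image, target_depth, K, T_src_from_target):
    """Synthesize a target view by sampling src_image with backward warping.

    target_depth comes from the simulator's geometry at the target camera pose;
    T_src_from_target is the 4x4 transform from the target camera frame to the source frame.
    """
    h, w = target_depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T.astype(np.float64)

    # Back-project target pixels to 3D points using the target view's depth.
    pts = (np.linalg.inv(K) @ pix) * target_depth.reshape(1, -1)
    pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])

    # Move the points into the source camera frame and project to source pixels.
    pts_src = (T_src_from_target @ pts_h)[:3]
    proj = K @ pts_src
    uv = (proj[:2] / np.clip(proj[2:], 1e-6, None)).T.reshape(h, w, 2)

    # Sample the generated source image at the reprojected locations.
    return cv2.remap(
        src_image,
        uv[..., 0].astype(np.float32),
        uv[..., 1].astype(np.float32),
        interpolation=cv2.INTER_LINEAR,
    )
```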

“We outperform domain randomization, a method developed in 2017 that applies random colors and patterns to objects in the environment, which is still considered the go-to method these days,” says Yu. “While this technique generates diverse data, it lacks realism. LucidSim addresses both diversity and realism problems. It’s exciting that even without seeing the real world during training, the robot can recognize and navigate obstacles in real environments.”

The team is particularly excited about the potential of applying LucidSim to domains outside quadruped locomotion and parkour, their main testbed. One example is mobile manipulation, where a mobile robot is tasked with handling objects in an open area and where color perception is critical.

“Today, these robots still learn from real-world demonstrations,” said Yang. “Although collecting demonstrations is easy, scaling a real-world robot teleoperation setup to thousands of skills is challenging because a human has to physically set up each scene. We hope to make this easier, thus qualitatively more scalable, by moving data collection into a virtual environment.”

A quadruped robot learned to navigate new environments using generative AI. MIT researchers used a Unitree Robotics Go1 quadruped. | Credit: MIT CSAIL

The team put LucidSim to the test against an alternative, where an expert teacher demonstrates the skill for the robot to learn from. The results were surprising: robots trained by the expert struggled, succeeding only 15 percent of the time – and even quadrupling the amount of expert training data barely moved the needle. But when robots collected their own training data through LucidSim, the story changed dramatically. Just doubling the dataset size catapulted success rates to 88%.

“And giving our robot more data monotonically improves its performance – eventually, the student becomes the expert,” said Yang.

“One of the main challenges in sim-to-real transfer for robotics is achieving visual realism in simulated environments,” said Stanford University assistant professor of Electrical Engineering Shuran Song, who wasn’t involved in the research. “The LucidSim framework provides an elegant solution by using generative models to create diverse, highly realistic visual data for any simulation. This work could significantly accelerate the deployment of robots trained in virtual environments to real-world tasks.”

From the streets of Cambridge to the cutting edge of robotics research, LucidSim is paving the way toward a new generation of intelligent, adaptable machines – ones that learn to navigate our complex world without ever setting foot in it.

Yu and Yang wrote the paper with four fellow CSAIL affiliates: mechanical engineering postdoc Ran Choi; undergraduate researcher Yajvan Ravan; John Leonard, Samuel C. Collins Professor of Mechanical and Ocean Engineering in the MIT Department of Mechanical Engineering; and MIT Associate Professor Phillip Isola.

Editor’s Note: This article was republished from MIT CSAIL

The AI Institute introduces Theia vision foundation model to improve robot learning
https://www.therobotreport.com/theia-vision-foundation-model-aiinstitute-generates-improve-robot-learning/ | Wed, 13 Nov 2024

Theia is a visual foundation model that the AI Institute says can distill diverse models for policy learning at a lower computation cost.


In the field of robotics, vision-based learning systems are a promising strategy for enabling machines to interpret and interact with their environment, said the AI Institute today. It introduced the Theia vision foundation model to facilitate robot training.

Vision-based learning systems must provide robust representations of the world, allowing robots to understand and respond to their surroundings, said the AI Institute. Traditional approaches typically focus on single-task models—such as classification, segmentation, or object detection—which individually do not encapsulate the diverse understanding of a scene required for robot learning.

This shortcoming highlights the need for a more holistic solution capable of interpreting a broad spectrum of visual cues efficiently, said the Cambridge, Mass.-based institute, which is developing Theia to address this gap.

In a paper presented at the Conference on Robot Learning (CoRL), the AI Institute introduced Theia, a model designed to distill the expertise of multiple off-the-shelf vision foundation models (VFMs) into a single model. By combining the strengths of multiple different VFMs, each trained for a specific visual task, Theia generates a richer, unified visual representation that can be used to improve robot learning performance.

Robot policies trained using Theia’s encoder achieved a higher average task success rate of 80.97% when evaluated against 12 robot simulation tasks, a statistically significant improvement over other representation choices.

Furthermore, in real robot experiments, where the institute used behavior cloning to learn robot policies across four multi-step tasks, the trained policy success rate using Theia was on average 15 percentage points higher than policies trained using the next-best representation.

Robot control policies trained with Theia outperform policies trained with alternative representations on MuJoCo robot simulation tasks, with much less computation, measured by the number of Multiply-Accumulate operations in billions (MACs). Source: The AI Institute

Theia designed to combine visual models

Theia’s design is based on a distillation process that integrates the strengths of multiple VFMs such as CLIP (vision language), DINOv2 (dense visual correspondence), and ViT (classification), among others. By carefully selecting and combining these models, Theia is able to produce robust visual representations that can improve downstream robot learning performance, said the AI Institute.

At its core, Theia consists of a visual encoder (backbone) and a set of feature translators, which work in tandem to incorporate the knowledge from multiple VFMs into a unified model. The visual encoder generates latent representations that capture diverse visual insights.

These representations are then processed by the feature translators, which refine them by comparing the output features against ground truth. This comparison serves as a supervisory signal, optimizing Theia’s latent representations to enhance their diversity and accuracy.

These optimized latent representations are subsequently used to fine-tune policy learning models, enabling robots to perform a wide range of tasks with greater accuracy.

Theia’s design is based on a process that distills the strengths of multiple VFMs, including CLIP, SAM, DINOv2, Depth-Anything, and ViT, among others. Source: The AI Institute

Robots learn in the lab

Researchers at the AI Institute tested Theia in simulation and on a number of robot platforms, including Boston Dynamics‘ Spot and a WidowX robot arm. For one of the rounds of lab testing, it used Theia to train a policy enabling a robot to open a small microwave, place toy food inside, and close the microwave door.

Previously, researchers would have needed to combine all the VFMs, which is slow and computationally expensive, or select which VFM to use to represent the scene in front of the robot. For example, they could choose a segmentation image from a segmentation model, a depth image from a depth model, or a text class name from an image classification model. Each provided different types and granularity of information about the scene.

Generally, a single VFM might work well for a single task with known objects but might not be the right choice for other tasks or other robots.

With Theia, the same image from the robot can be fed through the encoder to generate a single representation with all the key information. That representation can then be input into Theia’s segmentation decoder to output a segmentation image. The same representation can be input into Theia’s depth decoder to output a depth image, and so on.

Each decoder uses the same representation as input because the shared representation possesses the information required to generate all the outputs from the original VFMs. This streamlines the training process and makes actions transferable to a broader range of situations, said the researchers.

While it sounds easy for a person, the microwaving task represents a more complex behavior because it requires successful completion of multiple steps: picking up the object, placing it into the microwave, and closing the microwave door. The policy trained with Theia is among the top performers for each of these steps, comparable only to E-RADIO, another approach which also combines multiple VFMs, although not specifically for robotics applications.

Researchers used Theia to train a policy enabling a robot arm to microwave various types of toy food. Source: The AI Institute

Theia prioritizes efficiency

One of Theia’s main advantages over other VFMs is its efficiency, said the AI Institute. Training Theia requires about 150 GPU hours on datasets like ImageNet, reducing the computational resources needed compared to other models.

This high efficiency does not come at the expense of performance, making Theia a practical choice for both research and application. With a smaller model size and reduced need for training data, Theia conserves computational resources during both the training and fine-tuning processes.

AI Institute sees transformation in robot learning

Theia enables robots to learn and adapt more quickly and effectively by refining knowledge from multiple vision models into compact representations for classification, segmentation, depth prediction, and other modalities.

While there is still much work to be done before reaching a 100% success rate on complex robotics tasks using Theia or other VFMs, Theia makes progress toward this goal while using less training data and fewer computational resources.

The AI Institute invited researchers and developers to explore Theia and further evaluate its capabilities to improve how robots learn and interpret their environments.

“We’re excited to see how Theia can contribute to both academic research and practical applications in robotics,” it said. Visit the AI Institute’s project page and demo page to learn more about Theia.


Rockwell Automation adds NVIDIA Omniverse to digital twin software
https://www.therobotreport.com/rockwell-automation-adds-nvidia-omniverse-to-digital-twin-software/ | Tue, 12 Nov 2024

Rockwell said this joint system is valuable for consumer packaged goods, food and beverage, life sciences, and other industries.

With Rockwell Automation’s Emulate3D digital twin software, users can identify potential control issues preemptively, saving valuable time and resources during implementation. | Source: Rockwell Automation

Rockwell Automation is integrating NVIDIA Omniverse application programming interfaces (APIs) into its Emulate3D digital twin software. Rockwell Automation said this integration will enhance factory operations through artificial intelligence (AI) and physics-based simulation technology.

Rockwell Automation’s Emulate3D software uses NVIDIA Omniverse APIs to create factory-scale dynamic digital twins based on OpenUSD interoperability and NVIDIA RTX rendering technologies. While visualization was previously possible, this enhancement enables true emulation and dynamic testing of multiple machines within a system, Rockwell said. The integration is planned for early 2025 and will enable improved visualization and simulation capabilities for manufacturing environments.

“Our integration of Emulate3D with NVIDIA Omniverse marks a significant leap forward in bringing autonomous operations to life,” said Blake Moret, chairman and CEO of Rockwell Automation. “By combining our deep industrial expertise with NVIDIA’s cutting-edge technology, we’re helping our customers achieve new levels of efficiency, innovation, and collaboration in their manufacturing processes.”

Rockwell Automation is dedicated to industrial automation and digital transformation. The Milwaukee, Wis.-based company employs 27,000 people in more than 100 countries as of fiscal year-end 2024.

The company’s Emulate3D digital twin software helps users preemptively identify potential control issues, saving time and resources during implementation. Plant personnel receive additional support by having a virtual space to train on new systems, predict future performance, and simulate line changes without real-world consequences.

NVIDIA Omniverse enables digital twins at scale, Rockwell said

Digital twins enhance equipment development and control testing through simulation models and emulation, reducing startup time and risk. As equipment is connected into lines, models scale and challenges arise from siloed expertise and integration issues between separately engineered components.

A system-level perspective, including interoperability across machines, can solve these issues, but requires collaboration for system-level testing. As lines scale, larger digital twins require more computational power, risking bottlenecks. Automation leaders need scalable solutions to achieve full factory-scale models, building on digital twin successes.

By using NVIDIA Omniverse, Emulate3D will allow multiple dynamic digital twins to be combined and visualized as a complete factory through a web app. This vendor-agnostic, scalable approach aims to address the growing need for factory-scale digital twins created by engineers collaborating across various teams. Rockwell Automation’s expertise in industrial automation and Emulate3D’s comprehensive modeling capabilities pair with the NVIDIA Omniverse platform to enable real-time collaboration at scale.

Rockwell said manufacturers will benefit from:

  • Hyperscale capabilities through Emulate3D’s multi-model technology
  • Cloud-based deployment options for maximum flexibility
  • Vendor-agnostic connectivity to a wide range of 3D applications
  • A unified web app for stakeholder visualization

NVIDIA Omniverse lets developers integrate various factory layers into a comprehensive model, combining architectural software with industrial digital twins. This enables greater coordination across industrial design and operation. Built for scalability, Omniverse’s Universal Scene Description (OpenUSD) foundations and cloud deployments grow alongside projects. This means it helps meet customer demands for even the most complex endeavors.
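The OpenUSD layering and referencing model is what makes that kind of composition work. As a generic illustration (using the open-source `pxr` Python API, not Emulate3D's internals, and with hypothetical file names), separately authored workcell files can be referenced into a single factory stage:

```python
# Composing separately authored assets into one factory stage with OpenUSD
# (generic pxr example; file names are hypothetical, not Emulate3D internals).
from pxr import Usd, UsdGeom

stage = Usd.Stage.CreateNew("factory.usda")
UsdGeom.Xform.Define(stage, "/Factory")

# Each workcell lives in its own USD file, authored by a different team or vendor.
# Referencing keeps ownership separate while composing a single factory-scale view.
cell_a = UsdGeom.Xform.Define(stage, "/Factory/CellA").GetPrim()
cell_a.GetReferences().AddReference("./cell_a.usd")

cell_b = UsdGeom.Xform.Define(stage, "/Factory/CellB").GetPrim()
cell_b.GetReferences().AddReference("./cell_b.usd")

stage.GetRootLayer().Save()
```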

Rockwell said this system is particularly valuable for industries with complex, hybrid applications. These include consumer packaged goods, food and beverage, life sciences, semiconductor manufacturing, automotive, and material handling.

How Agility Robotics crosses the Sim2Real gap with NVIDIA Isaac Lab
https://www.therobotreport.com/how-agility-robotics-crosses-the-sim2real-gap-with-nvidia-isaac-lab/ | Fri, 08 Nov 2024

Agility Robotics gives insight into how it trains its humanoid robot Digit in simulation using NVIDIA's robotics tools.


At Agility Robotics, we are working with our humanoid robot Digit, which balances actively at all times, can recover from unexpected bumps, and can lift and move heavy things. There are 28 degrees of freedom and quite a few sensors. That’s a lot of information to take in, and a lot of actuators to coordinate, for complex actions that need to be decided in real time.

Every action or contact with the environment, like grasping a doorknob, can be felt at every joint of the robot and affects how the legs should balance the entire thing. 

So let’s start by talking about the two major approaches to controlling a dynamic robot like this:

  1. Using a model-based controller
  2. Using a learned controller

In many cases, we use model-based control and inverse dynamics to solve this problem. The model is a simplified physics model of Digit, running online on the computer inside the robot, calculating things like where to place a foot to balance the robot, based on measurements from the onboard inertial measurement unit and all of the position sensors.

We then use inverse dynamics, which uses a model of the humanoid’s link masses, inertias, actuators, and so on, to calculate the torques we need at each joint to get to the action we want from the model-based controller (i.e., how to get the foot to a certain place). This set of calculations must happen reliably in real time, hundreds or thousands of times per second.
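The torque calculation described here comes down to the standard rigid-body inverse dynamics relation, tau = M(q)·qdd + C(q, qd)·qd + g(q). The snippet below is a generic illustration of that equation only, with made-up numbers, not Agility's onboard controller code.

```python
# Generic rigid-body inverse dynamics (illustrative; not Agility's controller).
import numpy as np

def joint_torques(M, C, g, qdd_desired, qd):
    """Return tau = M(q) @ qdd_desired + C(q, qd) @ qd + g(q).

    M: n x n joint-space inertia matrix, C: n x n Coriolis/centrifugal matrix,
    g: length-n gravity torque vector, qdd_desired: desired joint accelerations,
    qd: current joint velocities.
    """
    return M @ qdd_desired + C @ qd + g

# Example with a toy 2-joint model (values are made up for illustration).
M = np.array([[2.0, 0.3], [0.3, 1.0]])
C = np.array([[0.0, -0.1], [0.1, 0.0]])
g = np.array([9.0, 2.5])
tau = joint_torques(M, C, g, qdd_desired=np.array([0.5, -0.2]), qd=np.array([0.1, 0.0]))
```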

But building up this kind of model and making the computation efficient can be quite challenging, since it requires us to enumerate all the details of physics that might be relevant, from the possible friction models of the terrain, to the possible types of errors in our joint sensors. It’s incredibly powerful, but it requires us to know a lot about the world.

Reinforcement learning (RL) is a totally different way of solving the same problem. Instead of using the robot and world models onboard in real-time to calculate the motor torques that we think will lead to the physics we want, we simulate worlds to learn a control policy for the robot ahead of time, on off-board computers that are much more powerful than the on-board robot computers.

We model the simulated robot in as much detail as possible, in a simulated world that has the expected obstacles and bumps, try millions of possible sets of motor torque commands, and observe all of the possible responses. We use cost functions to judge whether a command achieves a more or less useful response.

Over the course of significant simulation time — but only hours in real time — we learn a control policy that will achieve the goals we want, like walk around without falling, even if there’s an unexpected pothole in the ground. This means that we don’t need to exactly know what will happen in the real world; we just need to have found a controller that works well across a bunch of different worlds that are similar enough to the real one.

It trades off modeling effort for computational effort.  And it means that we can learn controllers that explore the limits of what might be physically possible with the robot, even if we don’t know exactly what those boundaries are ourselves.
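Stripped to its skeleton, that training process is a loop over many simulated worlds in parallel: sample actions from the current policy, let the physics respond, score the outcomes with a cost function, and update the policy. The sketch below is a generic outline with hypothetical `policy` and `make_sim_worlds` objects; it is not Isaac Lab or Agility code.

```python
# Generic policy-learning skeleton over many simulated worlds
# (hypothetical `policy` and `make_sim_worlds` objects; not Isaac Lab or Agility code).

def train_policy(policy, make_sim_worlds, num_iterations=1000, num_envs=4096):
    envs = make_sim_worlds(num_envs)           # thousands of randomized simulated worlds
    obs = envs.reset()
    for _ in range(num_iterations):
        actions = policy.sample(obs)           # candidate motor torque commands
        obs, costs, done = envs.step(actions)  # simulated physics produces the responses
        # The cost function scores each response (falling, energy use, asymmetry, ...);
        # the update nudges the policy toward lower-cost behavior across all worlds.
        policy.update(obs, actions, costs)
        obs = envs.reset_where(done)           # restart only the worlds that ended
    return policy
```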

Training Digit in simulation through NVIDIA’s tools designed for robotics. | Source: NVIDIA

We’ve been able to demonstrate this in areas like step-recovery, where physics are particularly hard to model. In situations where Digit loses its footing, it’s often a result of an environment where we don’t have a good model of what’s going on – there might be something pushing on or caught on Digit, or its feet might be slipping on the ground in an unexpected way. Digit might not even be able to tell which issue it’s having.

But we can train a controller to be robust to many of these disturbances with reinforcement learning, training it on many possible ways that the robot might fall until it comes up with a controller that works well in many situations. In the following chart, you can see how big of a difference that training can make:

Comparing performance of model-based controller (left) against a controller trained with reinforcement learning (right). | Source: Agility Robotics

Early last year, we started using NVIDIA Isaac Lab to train these types of models. Working with NVIDIA, we were able to make some basic policies that allowed Digit to walk around. But, to be honest, they started out with some weird behaviors.

One thing that we did get immediately, however, was the ability to run a lot more of our experiments. Moving to Isaac Lab and a GPU-accelerated environment was much faster than running simulations on the CPU. This enabled us to iterate much more quickly and start to identify the key area that we needed to improve:

Figuring out Agility Robotics’ Sim2Real gaps

In reinforcement learning, perhaps the biggest challenge is figuring out how to make a policy trained in a simulator transfer over to a real robot (hence the term “Sim2Real”). There are a lot of small differences between the real world and a simulated one, and even if you simulate a lot of worlds with a lot of variations, you might be missing some important component that always happens in the real world and never happens the same way in your simulations.

In our case, toe impacts are one such area. With every footstep, Digit impacts the ground with one of its toe plates. And the result of that impact is hard to predict.

Impacting the ground is already a very chaotic physical problem. (“Chaotic” in the formal sense, which is that very small deviations in the input can lead to unbounded variations in the output over time.)

Depending on exactly how your foot lands, you might slip, or have a lot of grip. You might be able to exert a lot of force, or only a little. And that small variation can lead to a big change in the outcome when you predict where the rest of your torso will end up.

This is exactly what happened with some of our earlier Isaac Lab policies. In simulation, the robot would walk confidently and robustly. But in the real world, it would slip and slide around wildly.

When you encounter a Sim2Real gap like this, there are two options. The easy option is to introduce a new reward, telling the robot not to do whatever bad thing it is doing. But the problem is that these rewards are a bit like duct tape on the robot — inelegant, missing the root causes. They pile up, and they cloud the original objective of the policy with many other terms. It leads to a policy that might work, but is not understandable, and behaves unpredictably when composed with new rewards.

The other, harder, option is to take a step back and figure out what it is about the simulations that differ from reality. Agility as a company has always been focused on understanding the physical intuition behind what we do. It’s how we designed our robot, all the way from the actuators to the software.

Our RL approach is no different. We want to understand the why and use that to drive the how. So we began a six-month journey to figure out why our simulated toes don’t do the same thing as our real toes.

It turns out there are a lot of reasons. There were simplifying assumptions in the collision geometry, inaccuracies in how energy propagated through our actuators and transmissions, and instabilities in how constraints are solved in our unique closed-chain kinematics (formed by the connecting rods attached to our toe plates and tarsus). And we’ve been systematically studying, fixing, and eliminating these gaps.

The net result has been a huge step forward in our RL software stack. Instead of a pile of stacked-reward functions over everything from “Stop wiggling your foot” to “Stand up straighter,” we have a handful of rewards around things like energy consumption and symmetry that are not only simpler, but also follow our basic intuitions about how Digit should move.
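
Agility has not published its exact reward terms. Purely as an illustration of what a handful of intuitive rewards can look like, the sketch below combines velocity tracking with energy and symmetry penalties; the weights and the symmetry measure are invented for the example.

```python
# Illustrative only: a compact reward in the spirit described above, not Agility's
# actual terms. Weights and the symmetry measure are invented; sign conventions for
# mirrored joints are omitted for brevity.
import numpy as np

def locomotion_reward(base_vel, target_vel, joint_torques, joint_vels,
                      left_leg_q, right_leg_q,
                      w_track=1.0, w_energy=5e-4, w_sym=0.2):
    """Track a commanded velocity while penalizing energy use and left/right asymmetry."""
    tracking = np.exp(-np.sum((np.asarray(base_vel) - np.asarray(target_vel)) ** 2))
    energy = np.sum(np.abs(np.asarray(joint_torques) * np.asarray(joint_vels)))   # power proxy
    asymmetry = np.sum((np.asarray(left_leg_q) - np.asarray(right_leg_q)) ** 2)
    return w_track * tracking - w_energy * energy - w_sym * asymmetry

# Example: near-perfect tracking, modest effort, slight asymmetry.
print(locomotion_reward([0.5, 0.0], [0.5, 0.0], [20, 30, 10], [1.0, 0.5, 2.0],
                        [0.1, 0.4, -0.2], [0.12, 0.38, -0.2]))
```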

Investing the time to understand why the simulation differed has taught us a lot more about why we want Digit to move a certain way in the first place. And most importantly, coupled with fast NVIDIA Isaac Sim, a reference application built on NVIDIA Omniverse for simulating and testing AI-driven robots, it’s enabled us to explore the impact of different physical characteristics that we might want in future generations of Digit.

A view of the underside of Agility Robotics' Digit's feet.

An example of a revised toe/foot concept (left), using four contact points, instead of the traditional shoe-style tread (right). | Source: Agility Robotics

We’ll be talking more about these topics at the 2024 Conference on Robot Learning (CoRL) next week in Munich, Germany. But the moral of the story is that understanding the dynamics of the world and our robot, and understanding the reasons for sources of noise and uncertainty rather than treating the symptoms, have let us use NVIDIA Isaac Lab to create simulations that are getting closer and closer to reality.

This enables simulated robot behaviors to transfer directly to the robot. And this helps us create simple, intuitive policies for controlling Digit that are more intelligent, more agile, and more robust in the real world.

Editor’s note: This article was syndicated, with permission, from Agility Robotics’ blog

About the author

Pras Velagapudi is the chief technology officer at Agility Robotics. His specialties include industrial automation, robotic manipulation, multi-robot systems, mobile robots, human-robot interaction, distributed planning, and optimization. Prior to working at Agility, Velagapudi was the vice president and chief architect of mobile robotics at Berkshire Grey. 

The post How Agility Robotics crosses the Sim2Real gap with NVIDIA Isaac Lab appeared first on The Robot Report.

]]>
https://www.therobotreport.com/how-agility-robotics-crosses-the-sim2real-gap-with-nvidia-isaac-lab/feed/ 0
Fulcrum provides inspection data pipeline for Cantilever analysis, explains Gecko Robotics https://www.therobotreport.com/fulcrum-provides-inspection-data-pipeline-for-cantilever-analysis-explains-gecko-robotics/ https://www.therobotreport.com/fulcrum-provides-inspection-data-pipeline-for-cantilever-analysis-explains-gecko-robotics/#respond Fri, 08 Nov 2024 14:38:46 +0000 https://www.therobotreport.com/?p=581475 Gecko Robotics has developed Fulcrum, which uses AI to provide high-quality infrastructure data to its Cantilever analytics software.

The post Fulcrum provides inspection data pipeline for Cantilever analysis, explains Gecko Robotics appeared first on The Robot Report.

]]>
Screenshot of Gecko Robotics' Cantilever software analyzing data from a robotic tank inspection.

Fulcrum can ensure that Cantilever has high-quality infrastructure data to analyze. Source: Gecko Robotics

Robotic maintenance of large structures and critical infrastructure is only as useful as the data it yields. Gecko Robotics Inc. has announced its Fulcrum software for data acquisition and quality. Its first public use was this week.

The Pittsburgh-based company, best known for its robots that can climb and maintain tanks, has also developed drones and software. It said its Cantilever operating platform uses artificial intelligence and robotics (AIR) for data analysis and to support fast decision-making at scale.

Jake Loosararian, co-founder and CEO, and Jennifer Padgett, engineering manager at Gecko Robotics, explained to The Robot Report how Fulcrum and Cantilever can enable new levels of insights from robotic inspection.

Fulcrum enables data analytics from multiple sources

What is Fulcrum?

Loosararian: Jenn designed and built Fulcrum. Its design is centered around creating an API [application programming interface] for robots.

It’s all in support of our goal for Gecko — to protect critical infrastructure. This requires information about the built world.

Robots armed with different sensors turn the physical world of atoms into bits. The key is ensuring those bits drive useful outcomes.

The sensors on robots and drones can collect a lot of data — how do you determine what’s useful?

Loosararian: We collect so much data, and so many different types of information, with our robots that climb walls or from fixed sensors. It’s not enough to just gather and post-process this data. We want to get as close to the process as possible.

Fulcrum is specifically built to gather data sets for high-fidelity foundation models. It’s designed not just to ensure quality data from all types of robots and sensors, but also to accelerate our ability to capture data layers for our Cantilever enterprise software.

For example, they can be used to predict when a tank might leak, a bridge might collapse, or a naval vessel might need to be modernized.

Padgett: We’re building a validation framework with our subject-matter expertise. We’ve collected millions of data points, while humans typically gather data points every square foot or two.

With Fulcrum, we understand the data as you’re collecting it and double-check it. We’ve optimized for inspections of concrete in missile silos, as well as tanks and boilers.
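
Gecko has not shared Fulcrum’s implementation. As a hypothetical illustration of validating data as it is collected, the sketch below flags ultrasonic thickness readings that fall outside plausible bounds or jump sharply between neighboring points; every threshold and name is invented.

```python
# Hypothetical sketch only: Fulcrum's internals are not public, and every threshold
# and name here is invented. Flags implausible ultrasonic thickness readings in-line.
from typing import Iterable, Iterator, Tuple

def validate_readings(readings_mm: Iterable[float],
                      nominal_mm: float,
                      max_loss_frac: float = 0.8,
                      max_step_mm: float = 2.0) -> Iterator[Tuple[float, bool]]:
    """Yield (reading, is_valid). A reading is suspect if it exceeds the nominal wall
    thickness, implies an implausible amount of loss, or jumps sharply from the last
    accepted value."""
    prev = None
    for r in readings_mm:
        in_range = 0.0 < r <= nominal_mm and r >= nominal_mm * (1.0 - max_loss_frac)
        smooth = prev is None or abs(r - prev) <= max_step_mm
        valid = in_range and smooth
        yield r, valid
        if valid:
            prev = r

# Example: nominal 12.7 mm plate; the 40.0 mm spike gets flagged for re-scan.
for value, ok in validate_readings([12.1, 11.8, 40.0, 11.9], nominal_mm=12.7):
    print(value, "ok" if ok else "re-scan")
```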

Gecko Robotics offers understanding of infrastructure health

How is Gecko Robotics pivoting from robotics hardware to AI?

Loosararian: We’ve traditionally developed hardware for data collection. Data quality is the starting point.

We’re helping people to understand what their livelihoods are based on by giving a full picture. Inspections affect everything from driving across a bridge to turning on the electricity.

We believe in democratizing data. We can’t build all the robots ourselves, and I recently talked onstage about the potential for humanoid robots like Tesla’s Optimus.

We’ve developed AI and built an ontology to connect things to monitor and maintain infrastructure health. Building and operating critical infrastructure is a matter of global competitiveness.

Padgett: With AI for pre-processing and low-level heuristics on key modules, Gecko can deliver useful data for customers. Fulcrum is really meant to provide higher-level analytics at the edge.

Jake, you mentioned the API and working with other robots. Who are you working with?

Loosararian: We’ve already made partnerships and are vetting a dozen companies for the kinds of tools that will be certified under the Gecko umbrella. We want to onboard as many robots as we can.

At the same time, we’re very skeptical of which robots are actually valuable. As we go to market with the platform, we understand which tools are good for marketing versus actually helping the business.

We’re not interested in research projects; we’re interested in companies that want specific, real-world impacts within 90 days. Right now, there’s a lot of skepticism around hardware and software, but with our robots and AI-powered software, the savings are real.

We’ve built up abstractions for how to interact with certain types of robots, drones, and marine systems. This makes it easy to add new systems; by working them into our communications protocol, we’re language-agnostic.

We’re also interested in new types of sensors and how they can affect outcomes.
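
Gecko’s communications protocol is not public, but the abstraction Loosararian describes, a common interface wrapped around different robot types, might be sketched like this; the class and method names are hypothetical.

```python
# Hypothetical sketch: Gecko's actual robot API and protocol are not public, and the
# class and method names below are invented for illustration.
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class RobotAdapter(ABC):
    """Anything that can stream localized sensor measurements into the data pipeline."""

    @abstractmethod
    def capabilities(self) -> List[str]:
        ...

    @abstractmethod
    def read_measurement(self) -> Dict[str, Any]:
        ...

class WallCrawlerAdapter(RobotAdapter):
    """Would wrap a vendor SDK or message protocol; here it returns canned data."""

    def capabilities(self) -> List[str]:
        return ["ut_thickness", "position_xy"]

    def read_measurement(self) -> Dict[str, Any]:
        return {"sensor": "ut_thickness", "value_mm": 11.9, "x_m": 4.2, "y_m": 1.3}

def ingest(robots: List[RobotAdapter]) -> List[Dict[str, Any]]:
    """The pipeline only cares about the shared measurement format, not the robot."""
    return [r.read_measurement() for r in robots if "ut_thickness" in r.capabilities()]

print(ingest([WallCrawlerAdapter()]))
```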




Predictive maintenance key to value proposition

What industries can this help?

Loosararian: It’s not one industry; it’s everyone. Infrastructure is huge — from aircraft carriers to mining companies. We’ve got products and services that help them understand the state of their assets.

Right now, we’re focusing on built structures using next-generation IoT [Internet of Things] sensors. With fixed robots, mesh networks, and 5G, we’re imagining beyond that.

Cantilever is already providing data on 500,000 assets, and it’s changing the way customers operate.

We’re constantly being pinged by companies that want us to integrate automated repairs and cleaning, which are important functions to maintaining safety and environmental sustainability.

We want to ensure that we can meet growing demand for things like shipyard maintenance despite the growing scarcity of qualified people. Fulcrum can offer relevant information, changing standard operating procedures that were built around human-collected data.

So is the goal to apply IoT and AI to spot issues before they become problems?

Loosararian: We can know what the robot is doing, what it should be collecting, and get the analysis. With the life-extension AIR module, we can look at the data layers in concrete, carbon steel, and stainless steel to extend the useful life of critical infrastructure.

Fulcrum is also part of capex [capital expenditure] optimization. Users want to avoid replacing things, having downtime, or suffering from catastrophic failures. They need specific data rather than broad strokes so they don’t have to worry about overpaying to replace something that doesn’t yet need to be replaced.

Another opportunity is process optimization. For example, an oil company needs to understand how a higher sodium concentration in the Gulf of Mexico will impact its assets. That’s built into the Cantilever data pipeline from Fulcrum.

The post Fulcrum provides inspection data pipeline for Cantilever analysis, explains Gecko Robotics appeared first on The Robot Report.

]]>
https://www.therobotreport.com/fulcrum-provides-inspection-data-pipeline-for-cantilever-analysis-explains-gecko-robotics/feed/ 0
CynLr raises Series A funding to realize robot vision for ‘universal factory’ https://www.therobotreport.com/cynlr-raises-series-a-funding-to-realize-robot-vision-for-universal-factory/ https://www.therobotreport.com/cynlr-raises-series-a-funding-to-realize-robot-vision-for-universal-factory/#comments Wed, 06 Nov 2024 22:03:21 +0000 https://www.therobotreport.com/?p=581453 CynLr, which is developing technology to enable robots to manipulate unknown objects, will grow its team and expand its supply chain network.

The post CynLr raises Series A funding to realize robot vision for ‘universal factory’ appeared first on The Robot Report.

]]>
CynLr has designed CLX to provide human-level vision to machines.

The CLX robotic vision stack was inspired by human eyesight. Source: CynLr

CynLr, or Cybernetics Laboratory, today said it has raised $10 million in Series A funding. The company said it plans to use the investment to enhance its hardware reliability, improve its software performance and user experience, reduce costs for the customer, and expand its team.

Gokul NA and Nikhil Ramaswamy founded CynLr in 2019. The Bengaluru, India-based company specializes in “visual object sentience,” robotics, and cybernetics. It is developing technology to enable robots to manipulate objects of any shape, color, size, and form toward its “universal factory” or “factory-as-a-product” concept.

“This round of investments will help us focus on deeper R&D to build more complex applications and solutions for our customers, like Denso, where they need to manage their demand variability for different parts through a hot-swappable robot station,” stated Ramaswamy, founder and lead for go-to-market sales and investment at CynLr.

He also cited plant-level automation customers. “With General Motors … they require one standard robot platform to handle 22,000+ parts for assembly of the vehicles,” Ramaswamy said. 




CynLr CLX-01 stack provides real-time, versatile vision

CynLr said its mission is to simplify automation and optimize manufacturing processes for universal factories. It was an exhibitor at the 2024 Robotics Summit & Expo in Boston.

The company claimed that it is building “the missing layers of fundamental technology” that will enable robots to intuitively recognize and manipulate even unknown objects just like a human baby might. CynLr said its “visual robot platform” enables robots to comprehend, grasp, and manipulate objects in complex and unpredicted environments. 

CyRo is a three-armed, modular, general-purpose dexterous robot. The company said CyRo is its first product that can intuitively pick any object without training and can be quickly configured for complex manipulation tasks.

CyRo uses CynLr’s proprietary CLX-01 robotic vision stack, inspired by the human eye. Unlike traditional vision systems that rely only on pre-fed data for machine learning, CLX-01 uses real-time motion and convergence of its two lenses to dynamically see the depth of previously unknown objects.
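
CynLr has not published the math behind CLX-01. As a rough illustration of why converging two lenses yields depth, the sketch below uses the simple geometry of symmetric vergence: the baseline between the cameras and the toe-in angle alone determine the distance to the fixation point.

```python
# Rough geometric illustration only; this is not CynLr's algorithm.
import math

def depth_from_vergence(baseline_m: float, toe_in_deg: float) -> float:
    """Distance to the fixation point for two cameras separated by `baseline_m`,
    each rotated inward by `toe_in_deg`, converging on a point midway between them."""
    theta = math.radians(toe_in_deg)
    return (baseline_m / 2.0) / math.tan(theta)

# A 65 mm baseline with about 1.86 degrees of toe-in per camera fixates near 1 m.
print(f"{depth_from_vergence(0.065, 1.86):.2f} m")
```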

CynLr added that its Event Imaging technology is agnostic to lighting variations, even for transparent and highly reflective objects. The company is partnering with multinational customers in the U.S. and EU to co-develop pilot applications.

“With the CyRo form factor receiving a resounding response from customers, technology-market fit has been firmly established,” said Gokul NA, co-founder and design, product, and brand leader at CynLr. “These customers are now eager to integrate CyRo into their production lines and experiment [with] the transformational vision of a universal factory that can profitably produce custom-fit consumer goods, even at low volumes.”

CyRo from CynLr includes the CLX-01 perception system and robotic arms.

The CyRo modular robot includes three robotic arms for complex manipulation tasks. Source: CynLr

Investors support universal factory concept

Pavestone, Grow X Ventures, and Athera Ventures (formerly Inventus India) led CynLr’s Series A round, which brings its total funding over two rounds to $15.2 million. Existing investors Speciale Invest, Infoedge (Redstart), and others also participated.

“CynLr’s concept of a universal factory will simplify and eliminate the minimum order quantity bottleneck for manufacturing,” said Sridhar Rampalli, managing partner at Pavestone Capital. “Furthermore, the idea of changing automation by simply downloading [a] task recipe from an online platform makes factories … product-agnostic. [They] can produce entirely new products out of [the] same factory at a click of a button; it’s a future that we look forward to.”

Vishesh Rajaram, managing partner at Speciale Invest, said: “Automating using a state-of-the-art industrial robot today costs 3x the price of a robot in customization, along with 24+ months of design modifications. This is the significant technological bottleneck that the team at CynLr is solving, paving the way for long-overdue evolution in automation. We are excited to be a part of their journey in building the factories of the future.”

“Enabling an industrial robot to perform seemingly simple tasks — like inserting a screw without slipping, for example — is what CynLr has managed to crack,” said Samir Kumar, general partner at Athera Venture Partners. “This breakthrough will enable the manufacturing industry to dramatically increase efficiency and maximize the value of production setups.”

From left, Gokul NA and Nikhil Ramaswamy, co-founders of CynLr.

From left, Gokul NA and Nikhil Ramaswamy, co-founders of CynLr. Source: CynLr

CynLr to expand staff, production, and ‘object store’

CynLr plans to expand its 60-member core team into a 120-member global team. In addition to expanding its software research and development team, the company said it will hire business and operational leaders, plus marketing and sales teams across India, the U.S., and Switzerland.

The 13,000-sq.-ft. (1,207.7 sq. m) robotics lab in Bengaluru currently hosts a “Cybernetics H.I.V.E.” of 25 robots, which CynLr plans to expand to more than 50 systems by 2026.

“CynLr manages an extensive supply chain of 400+ parts sourced across 14 countries and will expand its manufacturing capacity to achieve the goal of deploying one robot system per day and reach the $22 million revenue milestone by 2027,” said Gokul NA.

During Swiss-Indian Innovation Week in September, the company opened its Design & Research Center at the Unlimitrust Campus in Prilly, Switzerland. The center will work closely with CynLr’s research partners at the Learning Algorithms and Systems (LASA) Laboratory of EPFL (the Swiss Federal Institute of Technology Lausanne) and the Swiss Center for Electronics and Microtechnology (CSEM) in Neuchâtel.

“With the current momentum of breakthroughs in CyRo’s capabilities, we will be able to substantially reduce costs and drive adoption, bringing it closer to realizing the possibility of creating an ‘object store’ — a platform similar to today’s app stores, allowing customers to pick a recipe of applications and object models to have the Robot instantaneously perform a desired task,” explained Ramaswamy. “The company will simultaneously invest in infrastructure for support, solutions engineering, and sales to support this larger vision.”

The post CynLr raises Series A funding to realize robot vision for ‘universal factory’ appeared first on The Robot Report.

]]>
https://www.therobotreport.com/cynlr-raises-series-a-funding-to-realize-robot-vision-for-universal-factory/feed/ 2
NVIDIA adds open AI and simulation tools for robot learning, humanoid development https://www.therobotreport.com/nvidia-adds-ai-simulation-tools-robot-learning-humanoid-development/ https://www.therobotreport.com/nvidia-adds-ai-simulation-tools-robot-learning-humanoid-development/#respond Wed, 06 Nov 2024 11:00:05 +0000 https://www.therobotreport.com/?p=581417 NVIDIA said its Project GR00T workflows and model tools, plus its Hugging Face partnership, will boost robot dexterity and mobility.

The post NVIDIA adds open AI and simulation tools for robot learning, humanoid development appeared first on The Robot Report.

]]>
New Project GR00T workflows and AI world model development technologies to accelerate robot dexterity, control, manipulation and mobility.

New Project GR00T workflows and AI world model tools are intended to help developers of robot dexterity, control, manipulation, and mobility. Source: NVIDIA

NVIDIA Corp. today announced new artificial intelligence and simulation tools to accelerate development of robots, including humanoids. Also at the Conference on Robot Learning, Hugging Face Inc. and NVIDIA said they are combining their open-source AI and robotics efforts to accelerate research and development.

The tools include the generally available NVIDIA Isaac Lab robot learning framework and six new robot learning workflows for the Project GR00T initiative to accelerate humanoid development. They also include new world-model development tools for curating and processing video data: the NVIDIA Cosmos tokenizer and NVIDIA NeMo Curator.

Hugging Face said its LeRobot open AI platform combined with NVIDIA AI, Omniverse and Isaac robotics technology will enable advances across industries including manufacturing, healthcare, and logistics.




NVIDIA Isaac Lab to help train humanoids

Isaac Lab is an open-source robot learning framework built on NVIDIA Omniverse, a platform for developing OpenUSD applications for industrial digitalization and physical AI simulation. Developers can use Isaac Lab to train policies at scale for all types of robot movement, from collaborative robots and quadrupeds to humanoids, said NVIDIA.

The company said leading research entities, robotics manufacturers, and application developers around the world are using Isaac Lab. They include 1X, Agility Robotics, The AI Institute, Berkeley Humanoid, Boston Dynamics, Field AI, Fourier, Galbot, Mentee Robotics, Skild AI, Swiss-Mile, Unitree Robotics, and XPENG Robotics.

A guide to migrating from Isaac Gym is available online, and NVIDIA Isaac Lab is available now on GitHub.

Project GR00T offers blueprints for general-purpose robots

Announced at the Graphics Processing Unit Technology Conference (GTC) in March, Project GR00T aims to develop libraries, foundation models, and data pipelines to help the global developer ecosystem for humanoid robots. NVIDIA has added six new workflows coming soon to help robots perceive, move, and interact with people and their environments:

  1. GR00T-Gen for building generative AI-powered, OpenUSD-based 3D environments
  2. GR00T-Mimic for robot motion and trajectory generation
  3. GR00T-Dexterity for robot dexterous manipulation
  4. GR00T-Control for whole-body control
  5. GR00T-Mobility for robot locomotion and navigation
  6. GR00T-Perception for multimodal sensing

“Humanoid robots are the next wave of embodied AI,” said Jim Fan, senior research manager of embodied AI at NVIDIA. “NVIDIA research and engineering teams are collaborating across the company and our developer ecosystem to build Project GR00T to help advance the progress and development of global humanoid robot developers.”

Project GR00T now includes six new workflows to accelerate humanoid development, with motion models shown here.

Project GR00T now includes six new workflows to accelerate humanoid development. Source: NVIDIA

Cosmos tokenizers minimize distortion

As developers build world models, or AI representations of how objects and environments might respond to a robot’s actions, they need thousands of hours of real-world image or video data. NVIDIA said its Cosmos tokenizers provide high-quality encoding and decoding to simplify the development of these world models with minimal distortion and temporal instability.

The company said the open-source Cosmos tokenizer runs up to 12x faster than current tokenizers. It is available now on GitHub and Hugging Face. XPENG Robotics, Hillbot, and 1X Technologies are using the tokenizer.

“NVIDIA Cosmos tokenizer achieves really high temporal and spatial compression of our data while still retaining visual fidelity,” said Eric Jang, vice president of AI at 1X Technologies, which has updated the 1X World Model dataset. “This allows us to train world models with long horizon video generation in an even more compute-efficient manner.”
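
The Cosmos tokenizer’s actual API is not shown here. The sketch below only illustrates how one might measure the two properties Jang mentions, compression and retained fidelity, for any video tokenizer; the encode and decode functions are crude stand-ins based on striding and repeating pixels.

```python
# Hypothetical harness: `encode` and `decode` below are crude stand-ins, not the actual
# NVIDIA Cosmos tokenizer API. The harness just measures compression and fidelity.
import numpy as np

def psnr(original: np.ndarray, reconstructed: np.ndarray, peak: float = 255.0) -> float:
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def evaluate_tokenizer(video: np.ndarray, encode, decode) -> dict:
    """video: (frames, height, width, channels) uint8 clip."""
    tokens = encode(video)
    recon = decode(tokens)
    return {
        "compression_x": video.size / max(tokens.size, 1),   # temporal + spatial compression
        "psnr_db": psnr(video, recon),                        # retained visual fidelity
    }

# Toy stand-ins: 8x temporal and 8x8 spatial downsampling by striding, then nearest-
# neighbor upsampling by repetition.
encode = lambda v: v[::8, ::8, ::8, :]
decode = lambda t: t.repeat(8, axis=0).repeat(8, axis=1).repeat(8, axis=2)

clip = np.random.randint(0, 256, (32, 64, 64, 3), dtype=np.uint8)
print(evaluate_tokenizer(clip, encode, decode))
```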

NeMo Curator handles video data

Curating video data poses challenges due to its massive size, requiring scalable pipelines and efficient orchestration for load balancing across GPUs. In addition, models for filtering, captioning and embedding need optimization to maximize throughput, noted NVIDIA.

NeMo Curator streamlines data curation with automatic pipeline orchestration, reducing video processing time. The company said this pipeline enables robot developers to improve their world-model accuracy by processing large-scale text, image and video data.

The system supports linear scaling across multi-node, multi-GPU systems, efficiently handling more than 100 petabytes of data. This can simplify AI development, reduce costs, and accelerate time to market, NVIDIA claimed.

NeMo Curator for video processing will be available at the end of the month.
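
NeMo Curator’s interfaces are not reproduced here; the sketch below only shows the general shape of such a curation pipeline, with placeholder filter, caption, and embed stages standing in for learned models.

```python
# Illustrative pipeline shape only; NeMo Curator's actual interfaces are not shown here.
from typing import Callable, Iterable, List, Optional

def run_pipeline(clips: Iterable[dict],
                 stages: List[Callable[[dict], Optional[dict]]]) -> List[dict]:
    """Push each clip through filter/caption/embed stages; a stage returning None drops it."""
    kept = []
    for clip in clips:
        for stage in stages:
            clip = stage(clip)
            if clip is None:
                break
        else:
            kept.append(clip)
    return kept

# Placeholder stages standing in for learned models.
drop_dark = lambda c: None if c["mean_brightness"] < 20 else c       # quality filter
caption = lambda c: {**c, "caption": "robot arm picks an object"}    # would call a VLM
embed = lambda c: {**c, "embedding": [0.0] * 8}                      # would call an encoder

curated = run_pipeline(
    [{"path": "a.mp4", "mean_brightness": 90}, {"path": "b.mp4", "mean_brightness": 5}],
    [drop_dark, caption, embed],
)
print([c["path"] for c in curated])   # only 'a.mp4' survives
```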

Hugging Face, NVIDIA share tools for data and simulation

Hugging Face and NVIDIA announced at the Conference on Robot Learning (CoRL) in Munich, Germany, that they’re collaborating to accelerate open-source robotics research with LeRobot, NVIDIA Isaac Lab, and NVIDIA Jetson. They said their open-source frameworks will enable “the era of physical AI,” in which robots understand their environments and transform industry.

More than 5 million machine-learning researchers use New York-based Hugging Face’s AI platform, which includes APIs with more than 1.5 million models, datasets, and applications. LeRobot offers tools for sharing data collection, model training, and simulation environments, as well as low-cost manipulator kits.

Those tools now work with Isaac Lab on Isaac Sim, enabling robot training by demonstration or trial and error in realistic simulation. The planned collaborative workflow involves collecting data through teleoperation and simulation in Isaac Lab, storing it in the standard LeRobotDataset format.

Data generated using GR00T-Mimic will then be used to train a robot policy with imitation learning, which is subsequently evaluated in simulation. Finally, the validated policy is deployed on real-world robots with NVIDIA Jetson for real-time inference.

Initial steps in this collaboration have shown a physical picking setup with LeRobot software running on NVIDIA Jetson Orin Nano, providing a compact compute platform for deployment.
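
None of the real LeRobot, Isaac Lab, GR00T-Mimic, or Jetson APIs appear below; the sketch just lays out the described workflow, from teleoperated demonstrations through imitation learning to a deployment gate, with every function a placeholder.

```python
# Workflow outline only: every function below is a placeholder, not a real LeRobot,
# Isaac Lab, GR00T-Mimic, or Jetson API call.
def collect_demonstrations(num_episodes: int) -> list:
    """Teleoperate the simulated robot and record (observation, action) pairs."""
    return [[("obs", "action")] for _ in range(num_episodes)]

def generate_synthetic_trajectories(demos: list, multiplier: int = 10) -> list:
    """Stand-in for expanding a few human demos into many training trajectories."""
    return demos * multiplier

def train_imitation_policy(dataset: list):
    """Behavior cloning over the dataset; returns a policy callable."""
    return lambda obs: "action"

def evaluate_in_simulation(policy) -> float:
    """Success rate from simulated rollouts."""
    return 0.95

demos = collect_demonstrations(num_episodes=20)            # teleoperation in simulation
dataset = generate_synthetic_trajectories(demos)           # data augmentation step
policy = train_imitation_policy(dataset)                   # imitation learning
if evaluate_in_simulation(policy) > 0.9:                   # gate before touching hardware
    print("deploy the validated policy to the edge device for real-time inference")
```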

“Combining Hugging Face open-source community with NVIDIA’s hardware and Isaac Lab simulation has the potential to accelerate innovation in AI for robotics,” said Remi Cadene, principal research scientist at LeRobot.

Also at CoRL, NVIDIA released 23 papers and presented nine workshops related to advances in robot learning. The papers cover integrating vision language models (VLMs) for improved environmental understanding and task execution, temporal robot navigation, developing long-horizon planning strategies for complex multistep tasks, and using human demonstrations for skill acquisition.

Papers for humanoid robot control and synthetic data generation include SkillGen, a system based on synthetic data generation for training robots with minimal human demonstrations, and HOVER, a robot foundation model for controlling humanoid locomotion and manipulation.

Logos of NVIDIA and Hugging Face, which are collaborating on open-source AI R&D.

NVIDIA and Hugging Face, which are collaborating on open-source AI and robotics R&D. Source: NVIDIA

The post NVIDIA adds open AI and simulation tools for robot learning, humanoid development appeared first on The Robot Report.

]]>
https://www.therobotreport.com/nvidia-adds-ai-simulation-tools-robot-learning-humanoid-development/feed/ 0
Physical Intelligence raises $400M for foundation models for robotics https://www.therobotreport.com/physical-intelligence-raises-400m-for-foundation-models-for-robotics/ https://www.therobotreport.com/physical-intelligence-raises-400m-for-foundation-models-for-robotics/#respond Tue, 05 Nov 2024 01:42:37 +0000 https://www.therobotreport.com/?p=581410 Physical Intelligence is among the companies working to apply foundation models to training general-purpose robots.

The post Physical Intelligence raises $400M for foundation models for robotics appeared first on The Robot Report.

]]>
Physical Intelligence demonstrates the application of foundation models to training robots for tasks such as folding laundry and assembling cardboard boxes.

Physical Intelligence demonstrates the application of foundation models to training robots for tasks such as assembling boxes and folding laundry. Source: Physical Intelligence

Foundation models promise to give robots the ability to generalize actions from fewer examples than traditional artificial intelligence approaches. Physical Intelligence today announced that it has raised $400 million to continue its development of AI for a range of robots.

“What we’re doing is not just a brain for any particular robot,” Karol Hausman, co-founder and chief executive of Physical Intelligence, told The New York Times. “It’s a single generalist brain that can control any robot.”

The San Francisco-based company last week posted an explanation of its first generalist policy, which it claimed will make robots easier to program and use.

“To paraphrase Moravec’s paradox, winning a game of chess or discovering a new drug represent ‘easy’ problems for AI to solve, but folding a shirt or cleaning up a table requires solving some of the most difficult engineering problems ever conceived,” wrote Physical Intelligence.

“Our first step is π0, a prototype model that combines large-scale multi-task and multi-robot data collection with a new network architecture to enable the most capable and dexterous generalist robot policy to date,” said the company. “While we believe this is only a small early step toward developing truly general-purpose robot models, we think it represents an exciting step that provides a glimpse of what is to come.”

It’s still early days for foundation models

Early demonstrations of such generalist robot policies include folding laundry, assembling boxes, and dynamically putting objects into containers. Sergey Levine, an associate professor at the University of California, Berkeley, and co-founder of Physical Intelligence, shared these videos during his keynote address on building robotic foundation models at RoboBusiness last month.

Physical Intelligence acknowledged that foundation models that can control any robot to perform any task “are still in their infancy.” It said it is working on the data and partnerships to pretrain these models and enable new levels of dexterity and physical capability.

Physical Intelligence’s full research paper is available online.

Venture capital sees potential in AI plus robots

Physical Intelligence raised $70 million in seed financing earlier this year, and the company told The Robot Report that its valuation has risen to $2.4 billion. Jeff Bezos, executive chairman of Amazon, led the company’s latest funding round, along with Thrive Capital and Lux Capital.

Physical Intelligence thanked its investors on its website: “We are grateful for the support and partnership of Khosla Ventures, Lux Capital, OpenAI, Sequoia Capital, and Thrive Capital.”

The company is currently hiring.

Other recent fundraising rounds supporting work to apply foundation models to robotics include $675 million for humanoid developer Figure AI, $300 million for Skild AI, $6.6 billion for OpenAI, $100 million for Collaborative Robotics, and Accenture’s investment in Sanctuary AI, which more recently also received Canadian funding.

More companies working on AI and robotics include Covariant AI, whose team was hired by Amazon; Intrinsic, which used NVIDIA models; and Vayu, which is developing delivery robots. Apptronik is also working with NVIDIA on a general-purpose foundation model for its Apollo humanoid.




The post Physical Intelligence raises $400M for foundation models for robotics appeared first on The Robot Report.

]]>
https://www.therobotreport.com/physical-intelligence-raises-400m-for-foundation-models-for-robotics/feed/ 0
MIT develops multimodal technique to train robots https://www.therobotreport.com/mit-develops-multimodal-technique-to-train-robots/ https://www.therobotreport.com/mit-develops-multimodal-technique-to-train-robots/#respond Tue, 29 Oct 2024 15:35:57 +0000 https://www.therobotreport.com/?p=581322 Inspired by large language models, researchers develop a training technique that pools diverse data to teach robots new skills.

The post MIT develops multimodal technique to train robots appeared first on The Robot Report.

]]>
MIT researchers developed a multimodal technique to help robots learn new skills.

Researchers filmed multiple instances of a robot arm feeding a dog. The videos were included in datasets to train the robot. | Credit: MIT

Training a general-purpose robot remains a major challenge. Typically, engineers collect data that are specific to a certain robot and task, which they use to train the robot in a controlled environment. However, gathering these data is costly and time-consuming, and the robot will likely struggle to adapt to environments or tasks it hasn’t seen before.

To train better general-purpose robots, MIT researchers developed a versatile technique that combines a huge amount of heterogeneous data from many sources into one system that can teach any robot a wide range of tasks.

Their method involves aligning data from varied domains, like simulations and real robots, and multiple modalities, including vision sensors and robotic arm position encoders, into a shared “language” that a generative AI model can process.

By combining such an enormous amount of data, this approach can be used to train a robot to perform a variety of tasks without the need to start training it from scratch each time.

This method could be faster and less expensive than traditional techniques because it requires far fewer task-specific data. In addition, it outperformed training from scratch by more than 20% in simulation and real-world experiments.

“In robotics, people often claim that we don’t have enough training data. But in my view, another big problem is that the data come from so many different domains, modalities, and robot hardware. Our work shows how you’d be able to train a robot with all of them put together,” said Lirui Wang, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on this technique.

Wang’s co-authors include fellow EECS graduate student Jialiang Zhao; Xinlei Chen, a research scientist at Meta; and senior author Kaiming He, an associate professor in EECS and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). 

MIT researchers developed a multimodal technique to help robots learn new skills.

This figure shows how the new technique aligns data from varied domains, like simulation and real robots, and multiple modalities, including vision sensors and robotic arm position encoders, into a shared “language” that a generative AI model can process. | Credit: MIT

Inspired by LLMs

A robotic “policy” takes in sensor observations, like camera images or proprioceptive measurements that track the speed and position of a robotic arm, and then tells a robot how and where to move.

Policies are typically trained using imitation learning, meaning a human demonstrates actions or teleoperates a robot to generate data, which are fed into an AI model that learns the policy. Because this method uses a small amount of task-specific data, robots often fail when their environment or task changes.
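
A minimal behavior-cloning loop, written here with PyTorch on synthetic data rather than the MIT researchers’ code, makes the “demonstrations in, policy out” idea concrete.

```python
# Minimal behavior cloning on synthetic data; illustrative only, not the MIT code.
# Dimensions and data are made up; images would first be encoded into a feature vector.
import torch
from torch import nn

obs_dim, act_dim = 16, 7
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Stand-in for demonstration data: observations and the actions a human chose.
demo_obs = torch.randn(512, obs_dim)
demo_act = torch.randn(512, act_dim)

for epoch in range(50):
    pred = policy(demo_obs)
    loss = nn.functional.mse_loss(pred, demo_act)   # imitate the demonstrated action
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# At run time, the learned policy maps a new observation to a motor command.
action = policy(torch.randn(1, obs_dim))
print(action.shape)   # torch.Size([1, 7])
```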

To develop a better approach, Wang and his collaborators drew inspiration from large language models like GPT-4.

These models are pretrained using an enormous amount of diverse language data and then fine-tuned by feeding them a small amount of task-specific data. Pretraining on so much data helps the models adapt to perform well on a variety of tasks.

“In the language domain, the data are all just sentences. In robotics, given all the heterogeneity in the data, if you want to pretrain in a similar manner, we need a different architecture,” he said.

Robotic data take many forms, from camera images to language instructions to depth maps. At the same time, each robot is mechanically unique, with a different number and orientation of arms, grippers, and sensors. Plus, the environments where data are collected vary widely.




The MIT researchers developed a new architecture called Heterogeneous Pretrained Transformers (HPT) that unifies data from these varied modalities and domains.

They put a machine-learning model known as a transformer into the middle of their architecture, which processes vision and proprioception inputs. A transformer is the same type of model that forms the backbone of large language models.

The researchers align data from vision and proprioception into the same type of input, called a token, which the transformer can process. Each input is represented with the same fixed number of tokens.

Then the transformer maps all inputs into one shared space, growing into a huge, pretrained model as it processes and learns from more data. The larger the transformer becomes, the better it will perform.

A user only needs to feed HPT a small amount of data on their robot’s design, setup, and the task they want it to perform. Then HPT transfers the knowledge the transformer gained during pretraining to learn the new task.
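
The paper’s exact architecture is not reproduced here. A stripped-down sketch of the idea, with per-modality stems emitting a fixed number of tokens into a shared transformer trunk and a small per-robot action head, might look like the following; all dimensions are arbitrary.

```python
# Stripped-down sketch of the idea, not the authors' implementation: modality-specific
# "stems" emit a fixed number of tokens, a shared transformer trunk fuses them, and a
# small per-robot head maps the result to actions. Dimensions are arbitrary.
import torch
from torch import nn

class HPTStyleModel(nn.Module):
    def __init__(self, d_model=128, n_tokens=8, vision_dim=512, proprio_dim=32, act_dim=7):
        super().__init__()
        # Each stem projects its modality into the same number of shared-space tokens.
        self.vision_stem = nn.Linear(vision_dim, n_tokens * d_model)
        self.proprio_stem = nn.Linear(proprio_dim, n_tokens * d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=2)   # the shared, pretrained part
        self.action_head = nn.Linear(d_model, act_dim)            # swapped per robot or task
        self.n_tokens, self.d_model = n_tokens, d_model

    def forward(self, vision_feat, proprio):
        v = self.vision_stem(vision_feat).view(-1, self.n_tokens, self.d_model)
        p = self.proprio_stem(proprio).view(-1, self.n_tokens, self.d_model)
        fused = self.trunk(torch.cat([v, p], dim=1))   # same token budget per modality
        return self.action_head(fused.mean(dim=1))     # pooled features -> action

model = HPTStyleModel()
action = model(torch.randn(2, 512), torch.randn(2, 32))   # a batch of two inputs
print(action.shape)   # torch.Size([2, 7])
```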

Enabling dexterous motions

One of the biggest challenges of developing HPT was building the massive dataset to pretrain the transformer, which included 52 datasets with more than 200,000 robot trajectories in four categories, including human demo videos and simulation.

The researchers also needed to develop an efficient way to turn raw proprioception signals from an array of sensors into data the transformer could handle.

“Proprioception is key to enable a lot of dexterous motions. Because the number of tokens in our architecture is always the same, we place the same importance on proprioception and vision,” Wang explained.

When they tested HPT, it improved robot performance by more than 20% on simulation and real-world tasks, compared with training from scratch each time. Even when the task was very different from the pretraining data, HPT still improved performance.

“This paper provides a novel approach to training a single policy across multiple robot embodiments. This enables training across diverse datasets, enabling robot learning methods to significantly scale up the size of datasets that they can train on. It also allows the model to quickly adapt to new robot embodiments, which is important as new robot designs are continuously being produced,” said David Held, associate professor at the Carnegie Mellon University Robotics Institute, who was not involved with this work.

In the future, the researchers want to study how data diversity could boost the performance of HPT. They also want to enhance HPT so it can process unlabeled data like GPT-4 and other large language models.

“Our dream is to have a universal robot brain that you could download and use for your robot without any training at all. While we are just in the early stages, we are going to keep pushing hard and hope scaling leads to a breakthrough in robotic policies, like it did with large language models,” he said.

Editor’s Note: This article was republished from MIT News.

The post MIT develops multimodal technique to train robots appeared first on The Robot Report.

]]>
https://www.therobotreport.com/mit-develops-multimodal-technique-to-train-robots/feed/ 0
Universal Robots AI Accelerator offers to ease development of cobot applications https://www.therobotreport.com/ur-ai-accelerator-designed-ease-development-ai-driven-cobots/ https://www.therobotreport.com/ur-ai-accelerator-designed-ease-development-ai-driven-cobots/#respond Thu, 24 Oct 2024 17:45:36 +0000 https://www.therobotreport.com/?p=581257 The UR AI Accelerator offers reference hardware and software for faster development of cobot applications such as picking and palletizing.

The post Universal Robots AI Accelerator offers to ease development of cobot applications appeared first on The Robot Report.

]]>
The Universal Robots AI Accelerator Kit includes reference hardware and software.

The UR AI Accelerator toolkit includes reference hardware and software. Source: Universal Robots

The latest advances in artificial intelligence promise to improve robot capabilities, but engineers need to bring the technologies together. Universal Robots A/S last week announced its UR AI Accelerator, a hardware and software toolkit to enable the development of AI-powered collaborative robot applications.

The UR AI Accelerator is an extensible platform for developers to build commercial and research applications, said the Odense, Denmark-based company. It is also intended to accelerate research and reduce the time to market for AI products, Universal Robots said at ROSCon.

“If you’re building solutions on our platform, it will decrease your time to deployment while also de-risking the development of AI-based solutions,” stated James Davidson, chief AI officer at Teradyne Robotics, parent organization of Universal Robots.

“People spend an enormous amount of time on the connective tissue of these systems — selecting hardware, finding compute and cameras, and on compliance,” he told The Robot Report. “On the software side, developers have to decide what model to use and how to optimize it with the hardware.”

“We’ve pulled that together in a platform for reference applications and libraries to build solutions quickly,” Davidson added. “That way, developers can choose what toolsets and languages they want to use and spend their time on high value-added capabilities.”

AI Accelerator runs NVIDIA models for faster deployment

NVIDIA Isaac libraries and models running on the NVIDIA Jetson AGX Orin system-on-module bring AI acceleration to Universal Robots’ PolyScope X software. Isaac Manipulator can enhance cobot performance, and the toolkit includes the new Orbbec Gemini 335Lg 3D camera, the company added.

Universal Robots said the accelerator provides built-in demonstration programs to enable pose estimation, tracking, object detection, path planning, image classification, quality inspection, state detection, and more. While the UR+ ecosystem offers compatible hardware for cobot applications, the AI Accelerator provides functionality options.

“The AI Accelerator is fundamentally designed to make robots easier to deploy for inspection, picking, and other functions,” said Davidson. “We’re trying to bring up the utility of solutions to be leveraged more quickly by partners.”

“If you’re a roboticist, with component AI for reference projects, you can do dynamic motion planning and obstacle awareness,” he noted. “If you’re an AI developer, we have robotics components through ROS with NVIDIA wrapped around it, and you can go to our ecosystem.”

Universal Robots focuses on cobot app dev

For its AI Accelerator, Universal Robots identified enablers that it could put together to solve application problems by moving up the stack, Davidson explained.

“We vetted the AI components and designed reference examples for the hardware and software package,” he said. “The time saved depends on the application. This can have a huge impact on things like inspection or palletizing. Building a foundation model is very different from learning physical AI or integration.”

The platform is designed to be flexible. It can use simulation and digital twins, but it doesn’t require them, said Davidson.

“Think of simulation as addressing problems — synthetic data generation can bridge the sim-to-real gap, primarily in visual applications,” he said. “We are hopefully enabling customers to go further into the physical side.”

NVIDIA and partners such as Universal Robots, whose cobot is shown here, are working on tying AI to robotics.

Universal Robots and partners such as NVIDIA offer an AI Accelerator to ease cobot development. Source: Universal Robots

AI Accelerator, PolyScope X just the first steps

Universal Robots plans to show systems developed with its platform and AI at its PolyScope X Festival next month. The company, which claimed that it has sold more than 90,000 cobot arms worldwide, said PolyScope X is compatible with its e-Series and UR20 and UR30 cobots.

Davidson added that AI is starting to enable robots to be more reliable and to perform a wider range of tasks.

“Instead of programming a robot for piece picking, we’re moving to functional manipulation and more complex assembly tasks,” he said. “For a robot to grab a cup, it needs to handle multisensory inputs, such as visual and tactile data. Feedback and closed-loop systems will be key.”
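
Universal Robots has not shared code for this. A toy sketch of the closed-loop, multisensory idea Davidson describes, fusing a visual alignment error with tactile force feedback on each control cycle, could look like the following; all thresholds and field names are invented.

```python
# Purely illustrative closed-loop grasp logic; not Universal Robots' software, and all
# thresholds and field names are invented.
def grasp_step(visual_offset_mm: float, tactile_force_n: float,
               target_force_n: float = 5.0) -> dict:
    """Return one small corrective command per control cycle, based on fused feedback."""
    if abs(visual_offset_mm) > 2.0:                 # not yet aligned over the cup
        return {"move_lateral_mm": -0.2 * visual_offset_mm, "close_gripper": 0.0}
    if tactile_force_n < target_force_n:            # aligned, but grip still too light
        return {"move_lateral_mm": 0.0, "close_gripper": 0.5}
    return {"move_lateral_mm": 0.0, "close_gripper": 0.0}   # stable grasp: hold

# One simulated cycle: 6 mm off-center and no contact force yet, so nudge sideways first.
print(grasp_step(visual_offset_mm=6.0, tactile_force_n=0.0))
```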

“With our objective to take physical AI to an entirely new level, AI Accelerator is just the first to market of a series of AI-powered products and capabilities in UR’s pipeline, all with the focused goal of making robotics more accessible than ever before,” said Davidson. “We’re engaged with customers and partners.”

“The hype around AI is starting to die down, but all agree that it is continuing to make progress,” he said. “We’re at an inflection point. When I asked at trade shows two to three years ago how many people were using AI, only one or two people raised their hands. Now, 75% of people raise their hands.”

The post Universal Robots AI Accelerator offers to ease development of cobot applications appeared first on The Robot Report.

]]>
https://www.therobotreport.com/ur-ai-accelerator-designed-ease-development-ai-driven-cobots/feed/ 0